📄 phelix.tex

📁 phelix加密算法源代码,是一个开源的加密算法
💻 TEX
📖 第 1 页 / 共 4 页
字号:
\section{Other modes of use}
So far we have described Phelix as providing both encryption and
authentication. Phelix can be used in other modes as well. For any particular
key, Phelix should be used with a single mode. Using several modes with a
single key can lead to a loss of security.

\subsection{Additional Authentication Data (AAD)}
In packet environments, it is often desirable to authenticate the packet header
without encrypting it. From the encryption/authentication layer, this looks
like an additional string of data that is to be authenticated but not
encrypted.

Let $\ell(A)$ be the number of such unencrypted header bytes to be
authenticated. Phelix defines a simple extension for processing AAD that
seamlessly reverts to the non-AAD case when $\ell(A) = 0$, defined as follows.

The AAD is padded
with 0--3 zero bytes until the length is a multiple of four.  After cipher
initialization (section~\ref{sec:Initialization}), the padded AAD blocks are
``encrypted'' as plaintext blocks, but all the associated ciphertext words are
discarded. However, both before the first AAD block and after the final AAD
block, the state variable $Z_1$ (i.e., $Z^{(0)}_1$ and $\T\B Z^{(\lceil
\ell(A)/4 \rceil)}_1$) is \xored with the 32-bit hex constant 0xaadaadaa.
Next, the input message data is encrypted (or decrypted) as a normal message
through Phelix, except that the starting block number is $\lceil \ell(A)/4
\rceil$.  At this point, after the last message word is processed (i.e., at
the same time $Z_0^{(k)}$ is \xored with the value 0x912d94f1, as in
section~\ref{sec:ComputeMAC}), the following steps are also performed, with 
$ k = \lceil \ell(A)/4 \rceil + \lceil \ell(P)/4 \rceil$:
%
\begin{align*} 
Z^{(k)}_2  &:= Z^{(k)}_2  \xor \lfloor \ell(A)/2^{32} \rfloor \\ 
Z^{(k)}_4  &:= Z^{(k)}_4  \xor (\ell(A) \bmod 2^{32})
\end{align*}
% 
After these updates to the internal state, the final MAC processing proceeds
as in section~\ref{sec:ComputeMAC}, resulting in a MAC tag computed over the
AAD and plaintext.

\subsection{Truncated MAC Values}
To reduce bandwidth consumption, it is often desirable to use a MAC value
shorter than 128 bits. For example, in IPsec packets, MAC values (or ``tags'')
are typically truncated to 96 bits. Phelix allows the use of truncated MAC
tags, with some restrictions discussed below.  If desired, a single Phelix
key may be used with both MAC truncation and AAD without affecting security.

Let $t$ be the desired size, in bits, of the MAC tag, with $0 < t \leq 128$.
The definition of $X'_i$ in section~\ref{sec:BlockKeys} is then extended, for
the case $i \bmod 4 = 1$, to $X'_i := 4\cdot\ell(U) + 256\cdot(t \bmod 128)$.  In
other words, this technique embeds both $\ell(U)$ and $t$ into different bit
fields within $X'_i$. Note that the extended definition seamlessly reverts to
the non-truncated case when $t=128$.

After encryption, the truncated MAC tag consists of the first $t$ bits of the
16-byte MAC value. More formally, let the untruncated MAC bytes be
$(m_0,\ldots,m_{15})$.  The truncated MAC bits are then the bytes
$$(m_0,\ldots,m_{\lfloor (t-1)/8 \rfloor- 1}\;,
\;m_{\lfloor (t-1)/8 \rfloor} \bmod 2^{1+((t-1)\bmod 8)}).$$

Given the nature of Phelix, where plaintext can affect the internal cipher
state, MAC truncation must be handled prudently.  As an extreme example
in the absence of any restrictions, if a tag length $t=1$ were negotiated, an
attacker could submit forged packets with a repeated nonce value and have them
successfully decrypted with probability 0.5, probably opening the door to
devastating key recovery attacks.  Fortunately, truncated Phelix MACs can be
used safely, but only under certain restrictions:

\begin{itemize} 

\item The receiver must implement an anti-replay strategy, so that multiple 
packets with the same nonce will never be decrypted. Since anti-replay is
almost always a required policy anyway, this restriction is not
onerous.

\item The tag length $t$ must be negotiated together with the key $U$, and the
two values must always be bound together.  The receiver must always compare
all $t$ bits of the MAC tag before ``approving'' a packet and revealing its
plaintext.  In other words, an attacker must not be allowed to submit packets
to be encrypted or decrypted with a tag length $t$ different from that negotiated
when the key $U$ was selected.

\item Since an attacker will be able to mount an attack due to tag truncation 
only if he is able to have forgery attempts successfully decrypted, the tag
length $t$ must be large enough that there is no acceptable
probability of having the receiver accept and decrypt a forged packet within
the lifetime of the system.

\end{itemize}

This last restriction deserves some further discussion.  For example, suppose
that the receiver can verify and decrypt no more than $2^q$ packets per second,
and that the desired security lifetime is 100 years, or approximately $2^{32}$
seconds. The probability of a forgery succeeding in the lifetime is then $p
\approx 2^{q+32-t}$, so a value $t > q+64$ is required to guarantee $p <
2^{-32}$. As a particular case of interest, on a 10-Gbit/sec link using IPsec
with a minimum packet size of 64 bytes, we find $q \approx 21$, so we require
$t > 85$ to achieve $p < 2^{-32}$.  In practice, of course, it is almost
certain that a successful attack would require considerably more than a single
forged packet, so larger values of $p$ may well be acceptable. For example, a
less demanding hypothetical system might use an 802.11 wireless link with $q <
2^{14}$. If the goal is only $p < 2^{-8}$ (so that the probability of getting
more than four forged packets is easily less than $2^{-32}$) and the expected
system lifetime is 25 years, then the requirement could be met with $t > 54$.

Another common approach to enforcing such restrictions is to close a Phelix
``session'' (and negotiate a new key) after more than a certain number of
forgery attempts have been detected by the decryptor. Alternately, a policy of
rekeying after a certain number (e.g., $2^{32}$) of packets have been
processed will shorten the lifetime of a key and thus limit the probability of
a forgery being accepted. The particular settings chosen for any such policies
should be carefully chosen as part of the system design in order to meet the
desired security goals without affecting performance.

As a general guideline, if truncated Phelix tags must be used, values of $t <
96$ should be considered only after careful analysis, and values of $t <64$
are strongly discouraged in all cases. 

\subsection{Pure stream cipher, PRNG, and PRFG}

Phelix can be used as a pure stream cipher by ignoring the MAC computations at
the end.  However, to avoid trivial decryption attacks as discussed
above, this must not be done by simply truncating the MAC size to $t=0$.
Instead, stream-cipher Phelix should first encrypt the all-zero plaintext, with
the resulting Phelix ciphertext stream used as the stream-cipher keystream.  For
interoperability, this keystream should be generated with the full tag size
$t=128$.

Like any stream cipher, Phelix is a cryptographically strong
pseudorandom number generator. For every (key,nonce) input, it produces a
stream of pseudorandom data. This makes Phelix suitable for use as a PRNG.
Similarly, Phelix can be viewed as a Pseudorandom Function Generator:
given a secret key and encrypting an all-zero plaintext, the nonces define
different outputs, indistinguishable from uniformly chosen independent
bit strings. 

\subsection{MAC with Nonce}
Phelix can also be used a pure MAC function. The data to be authenticated is
encrypted, but the ciphertext is discarded. The receiver similarly discards
the keystream and just feeds the plaintext to the Phelix rounds. In this mode
Phelix is significantly faster than, for example, HMAC-SHA1, but it does
require a unique nonce for each message. Unfortunately, it is insecure to use
Phelix with a fixed nonce value.


\section{Design rationale}

Although the design strength of Phelix is 128 bits, we use 256-bit keys. This
avoids a very general class of attacks that exploits collisions on the key
value. For flexibility, Phelix also allows shorter keys to be used, as there
are many practical situations in which fewer than 256 bits of key material
are available.

The small set of elementary operations that Phelix uses makes it efficient on
a large number of software platforms. The absence of tables, variable rotations,
and multiplications makes Phelix small and efficient in hardware as well.

The number of active state words (i.e., 5) was chosen as the maximum state
that can easily fit in the registers of mainstream Intel CPUs. The number of
old state words (4) used for output ``masking'' was chosen so that the total
amount of unknown internal state is always at least 256 bits, even after the
keystream output.

Most ciphers use lookup tables to provide the necessary nonlinearity. In
Phelix, the nonlinearity comes from the mixing of \xors with additions.
Neither of these operations can be approximated well within the group of the
other. 

The diffusion in Phelix is not terribly fast, but it is unstoppable. As the
attacker has very little control over the state, it is not possible to limit
the diffusion of differences. In those areas where dynamic attacks are
possible, we use a sequence of 8 blocks to ensure thorough mixing of the
state words.

The key mixing is an unkeyed bijective function. The purpose is to spread
the available entropy over all key words. If, for example, the key is
provided by a SHA-1 computation, then only 5 words of key material are
provided. The key mixing ensures that all 8 key words depend on the key
material. Using a bijective mixing function ensures that no two 256-bit
input keys lead to the same working key values.  The use of the input key
length in $X'$ guarantees that even keys that lead to the same working key
(each short key leads to a working key that is also produced by a 256-bit
key) do not lead to equivalent Phelix encryptions.

\subsection{Key schedule}\label{sec:Rat-keyschedule}
The $X_{i,0}$ values simply cycle through the key words. The $X_{i,1}$
values depend on the same key words in anti-phase, the extended nonce words,
the block number, and the input key length. This key schedule has a number
of properties. All 8 key words and all 4 nonce words affect the state
every 4 blocks.

The key schedule also ensures that different $(K,N)$ pairs produce different
block key sequences. Even stronger: no sequence of 17 key words ever occurs
twice across all keys, all nonce values, and all positions in the encryption
computation.

To demonstrate this, we look at the sequence $W_j := X_{\lfloor j/2 \rfloor,
j \bmod 2}$. This is the sequence of key words in the order they are used.
Given just part of the sequence $W_j$, without the proper index values $j$,
we can recover the key, nonce, and block number. (When the plaintext word is
zero, the first half of the block function is identical to the second half of
the block function, so it makes sense to look at the sequence $W_j$ and
allow half-block offsets.)

If $W_j = W_{j+16}$, then $j$ is even; otherwise, $j$ is odd. This allows us
to split the $Y$ values back into an $X_{i,0}$ and $X_{i,1}$ sequence.

Now consider
\begin{align*}
D_i :={}& X_{i,1} - X_{i,0} + X_{i+4,1} - X_{i+4,0}\\
{}={}& N_{i \bmod 8} + N_{(i+4) \bmod 8} + X'_i + X'_{i+4} + 2i + 20\\
{}={}& (i \bmod 4) + 2i + 20 + X'_i + X'_{i+4}
\end{align*}
all modulo $2^{32}$. We first look at $D_i \bmod 4$. The $X'$ terms can only
have a nonzero contribution if $i \bmod 4 = 3$, so 3 out of 4 consecutive
times we get just $((i \bmod 4) + 2i) \bmod 4 = 3i \bmod 4$, which gives us
$i \bmod 4$. Looking at the full $D_i$ for an $i$ with $i \bmod 4 = 0$ gives
us $i \bmod 2^{31}$. The sum $X'_i + X'_{i+4}$ from the case $i \bmod 4 = 3$
gives us the upper bits of $i$. This recovers\footnote{This isn't absolutely
perfect. We do not recover the 62'nd bit of $i+8$, but this bit will only be
set during the very last few blocks of a message very close to $2^{64}$
bytes long. This does not lead to a weakness.} the block number, $i$.

Given $i \bmod 8$, we can recover the working key from the $X_{i,0}$'s.
Knowledge of $i$, and the key words allows us to compute the key length and
the nonce from the $X_{i,1}$'s, as well as check the redundancy introduced
by the nonce expansion to 8 words.

We have not investigated whether it is possible to recover the key, nonce,
and block number from fewer than 17 consecutive key words. A simple counting
argument shows that at least 14 are required. This remains an open problem.

\subsection{Choice of Rotation Counts}
Phelix mixes additions mod $2^{32}$ and bit-wise XORs. 
The rotations provide the diffusion between the various bit
positions in the state words. Without the rotations, the diffusion could only
go into one direction: from the less significant bit positions to the more
significant ones. Thus, the choice of rotation counts is critical for the
security of Phelix. During the design process we examined the
impact of various choices of rotation counts both in terms of attempts to
cryptanalyze the cipher, and also in terms of their impact on statistical
tests of the block function.

To analyze the diffusion properties of a set of rotation counts, consider a
variant of the block function with all the additions changed to \xors.
(This is equivalent to ignoring the carries in the additions.) In this
variant, we can track which output bits are affected by which input bits. For
this analysis, we consider an output bit affected if its computational path
has a dependency on the input bit at any one point, even if the output bit
in our linearized block function is not changed due to several dependencies
canceling out. This seems to be the most suitable way to analyze diffusion
and is related to the independence assumption in differential and linear
cryptanalysis.

A set of rotation counts can, at best, ensure that changing a single state
input bit affects at least 21 bits of the output. There are a large number
(over 6\ 000) of such rotation count sets.

We discarded all rotation count sets that contained a rotation count of 0,
1, 8, 16, 24, or 31. Rotation by a multiple of 8 repeats after only two or
four operations,
and rotation by 1 or 31 bit positions provides diffusion between adjacent
💿 文件大小 572 K
👤 上传用户 hlwang523
📂 所属分类 CA认证
🏷️ 相关标签

#phelix #加密算法 #源代码 #开源
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -