Advanced
Novel Technique in Linear Cryptanalysis
Novel Technique in Linear Cryptanalysis
ETRI Journal. 2015. Feb, 37(1): 165-174
Copyright © 2015, Electronics and Telecommunications Research Institute(ETRI)
  • Received : November 26, 2013
  • Accepted : November 04, 2014
  • Published : February 01, 2015
Download
PDF
e-PUB
PubReader
PPT
Export by style
Share
Article
Author
Metrics
Cited by
TagCloud
About the Authors
Wen-Long Sun
Jie Guan

Abstract
In this paper, we focus on a novel technique called the cube–linear attack, which is formed by combining cube attacks with linear attacks. It is designed to recover the secret information in a probabilistic polynomial and can reduce the data complexity required for a successful attack in specific circumstances. In addition to the different combination strategies of the two attacks, two cube–linear schemes are discussed. Applying our method of a cube–linear attack to a reduced-round Trivium, as an example, we get better linear cryptanalysis results. More importantly, we believe that the improved linear cryptanalysis technique introduced in this paper can be extended to other ciphers.
Keywords
I. Introduction
Linear cryptanalysis is an effective known-plaintext attack (KPA) against block ciphers. At present, KPA has been adapted to stream ciphers [1] [5] . M. Matsui and A. Yamagishi [6] , in 1992, introduced the idea of linear cryptanalysis in an attack on FEAL [7] . The techniques used in this attack were refined by M. Matsui and had a dramatic effect on the Data Encryption Standard (DES). This eventually led to the first experimental cryptanalysis of the cipher being reported in the open community [8] [9] .
Subsequently, several refinements to the basic idea of linear cryptanalysis have been suggested to improve the efficiency of the attacks it envelops, either in specific circumstances or in all cases. In 1994, B.S. Kaliski and M.J.B. Robshaw [10] proposed an extension to linear cryptanalysis based on the use of multiple linear approximations. S.K. Langford and M.E. Hellman [11] , in 1994, introduced the differential–linear attack, which is a mix of both differential cryptanalysis and linear cryptanalysis. In 1996, L.R. Kundsen and M.J.B. Robshaw [12] introduced the idea of extending M. Matsui’s linear cryptanalytic techniques to more general cases in which non-linear relations are also considered. In [13] , zero-correlation linear cryptanalysis, the counterpart of impossible differential cryptanalysis in the domain of linear cryptanalysis, was proposed by A. Bogdanov and V. Rijmen, resulting in a faster attack for some ciphers.
- 1. Motivation and Contribution
As introduced above, there have been several extensions to linear cryptanalysis at present. Nevertheless, is there any available information not yet exploited by previous linear cryptanalysis methods?
In this paper, we answer this question positively. Generally speaking, linear cryptanalysis exploits specific correlations between the input and the output of cryptographic primitives. For almost any cryptographic scheme, each output bit can be described by a multivariate polynomial over GF(2) in both the public variables and the secret variables. As we know, a very powerful tool to recover the secret variables is cube attacks, proposed by [14] . To derive the secret information, the attacker sums this bit over all possible values of a subset of the public variables. The summations are used to derive linear equations in the secret variables, which can be efficiently solved. When launching linear attacks, we can, in specific scenarios, obtain the explicit description of a multivariate polynomial. Since the polynomial’s probability is less than one after one or several approximations in linear attacks, we have to adapt our use of cube attacks in such cases because they are used only on such polynomials when the polynomial’s probability is equal to one.
Here, we combine cube attacks with linear attacks to propose a novel method called cube–linear attack, aiming to recover the secret variables of a probabilistic multivariate polynomial. It uses information that both cube attacks and linear attacks use. Once the secret variables have been recovered, the degree of non-linear monomials involving the recovered bits would decrease and may even fall to zero. This is the same result as desired by linear attacks. Subsequent cryptanalysis results prove our method to be efficient.
In addition to the two combination scenarios (using linear attacks during a cube attack and using cube attacks during a linear attack), two schemes, A and B, are discussed. The data complexity and success rate of each scheme can be calculated only when all the derived polynomials [14] defined by any determined cube index are independent. Otherwise, the situations are too complex to give any concrete computational formula using the existing theories. An introduction to the detailed theory of our method is given in Section III. Afterwards, an improved linear cryptanalysis is proposed using the cube–linear attack as a novel technique. As an application, we cryptanalyze the security of the eSTREAM finalist, Trivium, against linear cryptanalysis. Three linear approximations with the same average bias, 2 −23.2 , are found, and four key bits are recovered for the reduced version of Trivium with the initialization of 288 rounds (out of 1,152). The data complexity is 2 47.2 IVs with 97.8% success rate, improving upon previous linear cryptanalysis results. Although a few better cryptanalysis results on Trivium have been published using other attacks ( [14] , [15] , and [16] ), it’s confirmed that our method is meaningful from the point of view of improving linear cryptanalysis.
For convenience, we particularly follow the relevant concepts and terminology of cube attacks. Although based on the same essence of higher-order differential cryptanalysis, there are some differences between cube attacks and our cube–linear attack. Firstly, cube attacks are applied to such a polynomial having probability one, while our method copes with probabilistic polynomials. Secondly, the primary cost of cube attacks lies in the searching of the appropriate cube indexes, while our method focuses on an explicit polynomial. Moreover, they are two different methods, and our cube–linear attack is proposed as an improved technique for linear cryptanalysis.
- 2. Organization
This paper is organized as follows. In Section II, we briefly review cube attacks and linear cryptanalysis. Afterwards, we describe the theory of our cube–linear attack and propose an improved linear cryptanalysis (Section III). In Section IV, we apply our method to a specific analysis on Trivium. Finally, we make a few concluding remarks in Section V.
II. Cube Attacks and Linear Cryptanalysis
- 1. Review of Cube Attacks
Cube attacks [14] , first formalized by I. Dinur and A. Shamir at EUROCRYPT 2009, are a generic type of algebraic attack and can be applied to any cryptosystem, provided that the attacker has access to a bit of information that can be represented by a low-degree multivariate polynomial over GF(2).
In almost any cryptographic scheme, each output bit, zi , can be described by a multivariate master polynomial over GF(2) comprising public variables v 1 , v 2 , … , vm , which are either bits of the plaintext of a block cipher or bits of the initial vector of a stream cipher and that are dependent upon secret variables x 1 , x 2 , … , xn zi = p ( v 1 , … , vm , x 1 , … , xn ).
To simplify our notation, the distinction between public and private variables is now ignored. Given a multivariate master polynomial with n variables p ( x 1 , … , xn ) over GF(2) in algebraic normal form (ANF) and a term tI containing variables from an index subset I that are multiplied together, the polynomial can be rewritten as the sum of terms that are supersets of I and terms that miss at least one variable from I : p ( x 1 , … , xn ) ≡ tI · P S(I) + q ( x 1 , … , xn ), where P S(I) is called the superpoly of I in p . Note that the superpoly of I in p is a polynomial that does not contain any variables in common with tI , and each term in q ( x 1 , … , xn ) does not contain at least one variable from I .
Any subset I of size s defines an s-dimensional Boolean cube of 2 s vectors, CI , in which we assign all the possible combinations of 0/1 values to variables in I and leave all the other variables undetermined. Any vector v CI defines a new derived polynomial p |v with n s variables (whose degree may be the same or lower than the degree of the original polynomial). Summing these derived polynomials over all the 2 s possible vectors in CI , we end up with a new polynomial, which is denoted by
p I = Δ ∑ p |v
. A maxterm of p is a term tI such that the superpoly P S(I) of I in p is a linear polynomial that is not a constant.
Theorem 1 [14] . For any polynomial p and subset of variables I , we have pI P S(I) mod 2.
Cube attacks have two phases. In the first preprocessing phase, the task is to find as many maxterms and corresponding linear superpolys as possible. In the next phase, the online phase, the attacker solves the system of linear equations obtained and acquires some values related to the secret variables.
- 2. Review of Linear Cryptanalysis
The basic idea behind linear cryptanalysis is to find some linear approximation to the action of cryptographic primitives. In other words, what the attacker exploits are some statistical correlations between the input and the output. For a cryptosystem with k -bit key ( k 1 , k 2 , … , kk ), n -bit plaintext ( p 1 , p 2 , … , pn ), and ciphertext ( c 1 , c 2 , … , cn ), the task of the attacker is to find the index sets I , J , and L such that
jJ p j lL c l = iI k i
holds with probability p = 1/2 + ε , ε ≠ 0.
For the block and stream ciphers, linear attacks are usually executed as follows. First, we look for the linear or nonlinear approximations of different rounds, and then we combine them. From this, we can obtain final linear approximations for the whole cryptosystem with probability calculated according to Lemma 1 (piling-up Lemma).
Lemma 1 [8] . For each value i , 1 ≤ i n , let Xi be a random variable, independent of Xj for all j i , such that P ( Xi = 0) = pi , P ( Xi = 1) = 1 − pi . Then
P( X 1 X 2 X n =0)= 1 2 + 2 n1 i=1 n ( p i 1/2 ) .
Given a linear approximation, it is possible to determine one bit of information about the key
∑ i∈I k i
with the help of Algorithm 1 [8] . The core idea of Algorithm 1 is a maximum-likelihood.
       Algorithm 1: Determination of key information. T : = # of plaintexts (out of N) such that the left side of (1) is equal to 0. IF T > N/2 THEN guess k i =0 (when p > 1/2) or 1 (otherwise) ELSE guess k i =1 (when p > 1/2) or 0 (otherwise) END
The computational formula of the success rate to recover the key
∑ i∈I k i
is as follows [8] and is related to both the data complexity N (the number of plaintext/ciphertext pairs) and the bias ε :
γ= 1 2π 2 N ε    e x 2 /2  dx .
The main goal of linear cryptanalysis is to find an effective linear approximation. However, thus far, there has been no optimal algorithm to look for such a linear approximation for any cryptosystem. In this paper, we tentatively put forward a novel technique called cube–linear attack, contributing to linear cryptanalysis.
III. Improved Linear Cryptanalysis
- 1. Cube–Linear Attack
This subsection provides the basic theory of our cube–linear attack. Generally speaking, linear cryptanalysis exploits specific correlations between the input and output of cryptographic primitives. For almost any cryptographic scheme, each output bit can be described by a multivariate master polynomial over GF(2) in the public variables and the secret variables. When launching linear attacks, we can, in specific scenarios, obtain the explicit description of a multivariate polynomial zi = p ( v , k ) with probability p *. Based on cube attacks, the polynomial zi = p ( v , k ) is easily split into the form p ( x 1 , … , xn ) ≡ tI · P S(I) + q ( x 1 , … , xn ) for any term tI . We can determine the maxterm tI leading to a linear expression P S(I) , and then the secret variables in the resultant system of polynomial equations can be efficiently solved. Since the polynomial’s probability, p *, is less than one after one or several approximations in linear attacks and cube attacks are only used on such polynomials when their probability is equal to one, the actual use of cube attacks in this case means that they first need to be adapted prior to their use. Here, we combine cube attacks with linear attacks to propose a novel method called a cube–linear attack, aiming to recover the secret variables of a probabilistic multivariate polynomial. Based on the aforementioned different ways of combining cube attacks and linear attacks, two schemes, A and B, are discussed under the following attack condition: all derived polynomials defined by any determined maxterm are independent.
A. Description of Scheme A
Scheme A can be considered as the “cube–linear–cube” attack, the name befittingly corresponding to the three steps that it incorporates. The following analysis embodies a more detailed introduction of scheme A. Based on the split polynomial p ( v , k ) ≡ tI · P S(I) + q ( v , k ), as mentioned above, we can easily determine the maxterm leading to an expression P S(I) of which the degree d ( P S(I) ) is one. Correspondingly, I and CI are also distinct. Running all the possible values of CI , any vector v CI can define a derived polynomial p |v with probability
p v *
(
p v *
stands for the probability of p |v when CI takes the value v on condition that the polynomial zi = p ( v , k ) holds with probability p *). Let us denote K |v the XOR of all the monomials involving only the key information in the derived polynomial p |v . Then K |v can be determined by Algorithm 1 since the derived polynomial p |v is a linear expression when considering K |v as a single variable. Similar to cube attacks, we can obtain a new linear equation when summing all the recovered K |v .
        Scheme A Step 1. Determine the maxterm tI. Step 2. Recover K|v in each derived polynomial p|v by Algorithm 1. Step 3. Sum all the recovered K|v, and then the value v C I K |v is known.
Theorem 2. The data complexity of scheme A is
N A = ∑ v=0 2 s −1 N v =O( ∑ v=0 2 s −1 ( | p v * −1/2 | ) −2 )
, and the success rate of scheme A is
γ A =1/2 + 2 2 s −1 ∏ v=0 2 s −1 ( γ v −1/2 )
, where s is the size of index subset I ; Nv and γv are respectively referred to as the data complexity and the success rate to recover K |v using Algorithm 1.
Proof. Since the probability of the derived polynomial p |v is
p v *
, the data complexity to recover K |v is easily calculated as
N v =O( | p v * −1/2 | −2 )
with a confident success rate γv according to Algorithm 1. Consequently, the total data complexity of scheme A is
N A  =  ∑ v=0 2 s −1 N v  =  O( ∑ v=0 2 s −1 ( | p v *  − 1/2 | ) −2 )
.
Let
γ v *
be the failure rate to recover K |v , then
γ v * =1− γ v
. For simplicity, we may as well denote “0” and “1” as the “right” and “wrong” values of K |v , respectively; that is,
p( K |v =0 )= γ v
and
p( K |v =1 )= γ v *
, respectively. Based on the aforementioned attack condition, we deduce that all the recovered K |v are independent of each other. Therefore, the success rate of scheme A is
γ A =p( ⊕ v=0 2 s −1 K |v =0 )= 1/2 + 2 2 s −1 ∏ v=0 2 s −1 ( γ v −1/2 )
by Lemma 1.                        ■
B. Description of Scheme B
       Scheme B Step 1. Determine the maxterm tI. Step 2. Sum all the derived polynomials p|v, then we have P S(I) = Δ v C I p |v . Step 3. Recover the key information in PS(I) by Algorithm 1.
Corresponding to the above three steps, the first two steps of scheme B are actually the use of the cube attack, and then the key information in P S(I) can be determined with the help of Algorithm 1.
Theorem 3. The data complexity of scheme B is
N B =O( ( 2 2 s −1 ∏ v=0 2 s −1 | p v * −1/2 | ) −2 )
with a confident success rate γ B .
Proof. Under the aforementioned attack condition, all the derived polynomials p |v defined by v CI hold independently with probability
p v *
. So, the bias that
P S(I)  =  ∑ v∈ C I p |v
holds is calculated as
2 2 s −1 ∏ v=0 2 s −1 | p v *  − 1/2 |
due to Lemma 1. By Algorithm 1, the data complexity N B to recover the key information in P S(I) with a confident success rate γ B is
O( ( 2 2 s 1 v=0 2 s 1 | p v * 1/2 | ) 2 ).                         ■
C. Comments on Schemes A and B
It should be noted that the recovered key information using scheme A is the same as that of scheme B.
Theorem 4. When the probability of each derived polynomial
p v *
satisfies
| p v * −1/2 |= 2 −1.5
,
N A ≤ 2 s 2 2 s −1 N B
(equality holds iff
| p v * −1/2 |= 2 −1.5
), where s is the size of index subset I ; and N A and N B represent respectively the data complexity of scheme A and scheme B.
Proof. According to Theorem 2 and Theorem 3,
N A ​= O( ∑ v=0 2 s −1 ( | p v * ​− 1/2 | ) −2 )
and
N B =O( ( 2 2 s −1 ∏ v=0 2 s −1 | p v * −1/2 | ) −2 )
. Let
x v  = ( 2⋅| p v * −1/2 | ) −2 ​, x v ​∈(1,+∞)
, then
N A  =4 ∑ v=0 2 s −1 x v
and
N B =4⋅ ∏ v=0 2 s −1 x v
. If N A = N B and all xv share the same value, then it’s deduced that
x v = 2 s/ ( 2 s −1)
.
When
x v ≥ 2 s/ ( 2 s −1)
, we have the following:
v=0 2 s 1 x v v=0 2 s 1 x v = x 0 + x 1 +    + x 2 s 1 x 0 x 1      x 2 s 1                     = 1 x 1 x 2      x 2 s 1 +    + 1 x 0 x 1      x i1 x i+1      x 2 s 1 +                              + 1 x 0 x 1      x 2 s 2                      1 2 s +    + 1 2 s +    + 1 2 s = 1 2 s 2 s =1.
Specifically, when xv ≥ 2; that is,
| p v * −1/2 |≤ 2 −1.5
, we have
v=0 2 s 1 x v v=0 2 s 1 x v = x 0 + x 1 +    + x 2 s 1 x 0 x 1      x 2 s 1                     = 1 x 1 x 2      x 2 s 1 +    + 1 x 0 x 1      x i1 x i+1      x 2 s 1 +                             + 1 x 0 x 1      x 2 s 2                      1 2 2 s 1 +    + 1 2 2 s 1 +    + 1 2 2 s 1 = 2 s 2 2 s 1 .
That is,
N A ≤ 2 s 2 2 s −1 N B
.                        ■
We have proved that
N A ≤ 2 s 2 2 s −1 N B ≤ N B  
only when the probability
p v *
of each derived polynomial satisfies
| p v * −1/2 |= 2 −1.5
. The results haven’t yet been proved for the remaining scenarios. With regard to the success rate of the two schemes,
γ A =1/2 + 2 2 s −1 ∏ v=0 2 s −1 ( γ v −1/2 )
— where r B is equal to r v depending on the relationship between the data complexity and the bias. If no more plaintext/ciphertext pairs exist, then the success rate of scheme A would be lower than that of scheme B.
Table 1 provides some comparisons between schemes A and B at the same level of success rate. It shows that scheme A is in fact more superior in most scenarios because of the following reasons: First of all, since the probability of the derived polynomial
p v *
in most cases satisfies
| p v * −1/2 |≪ 2 −1.5
, it means that N A N B according to our calculations. Secondly, the success rate r A doesn’t rapidly drop due to
γ A =1/2 + 2 2 s −1 ∏ v=0 2 s −1 ( γ v −1/2 )
. Furthermore, scheme A could actually be enhanced by increasing a few plaintext/ciphertext pairs based on the theory of linear cryptanalysis. Taken together, when to choose scheme A or scheme B should depend on the particular situation.
Comparison between schemes A and B.
2−1.3 2−6 2−9 Success rate
NA NB NA NB NA NB
1 24 23.2 213.4 222 219.4 234 97.8%
2 25.3 24.4 214.7 242 220.7 266
3 26.5 26.8 215.9 282 221.9 2130
Remark 1. For both schemes, the first thing we need to do in practice after determining the maxterm tI is to verify whether the attack condition comes into existence or not. If it is tenable, then we have the above theorems. Otherwise, the situations are too complicated to obtain any concrete results for the data complexity and success rate of the two schemes using the present theory.
- 2. Improved Linear Cryptanalysis
For a probabilistic polynomial, it is the secret variables recovered by our cube–linear attack that have never before been exposed by previous linear cryptanalysis. Consequently, it’s confirmed that the cube–linear attack indeed provides us with a paradise of improvement in linear cryptanalysis. Overall, the improved linear cryptanalysis could be summarized as follows.
        Improved linear cryptanalysis. When obtaining a polynomial zi = p(v, k) with probability p*: Step 1. Verify whether the attack condition comes into existence or not, if it is tenable, then go to the next step; Otherwise, our theory won’t work. Step 2. Based on Theorems 2, 3, and 4, choose a better scheme (scheme A or B) to recover the key information. Step 3. Combining the recovered key information, carry on looking for its linear approximation.
As an application, we cryptanalyze the security of the eSTREAM finalist, Trivium, against linear cryptanalysis using our method in the next section.
IV. Improved Linear Cryptanalysis on Trivium
- 1. Trivium Stream Cipher
Trivium [17] , a hardware-oriented stream cipher, was designed by C. De Canniѐre and B. Preneel and was selected for the final eSTREAM portfolio [18] . It takes an 80-bit key K and an 80-bit initial value (IV) as input. The internal state consists of 288 bits, which are aligned in three non-linear feedback shift registers of lengths 93, 84, and 111. It is claimed to be suitable to generate up to 2 64 bits of keystream from a pair of key and IV. They are initialized as follows:
( s 1 ,   s 2 ,    ,   s 93 )( k 1 ,    ,   k 80 ,  0,    ,  0), ( s 94 ,    ,   s 177 )( iv 1 ,    ,   iv 80 ,  0,    ,0), ( s 178 ,    ,   s 288 )(0,    ,  0,  1,  1,  1) .
The state is then updated iteratively by the following round transformation:
t 1 s 66 + s 93 , t 2 s 162 + s 177 , t 3 s 243 + s 288 , z t 1 +    t 2 +   t 3 , t 1 s 66 + s 91 s 92 + s 93 + s 171 , t 2 s 162 + s 175 s 176 + s 177 + s 264 , t 3 s 243 + s 286 s 287 + s 288 + s 69 , ( s 1 ,   s 2 ,  ...  ,   s 93 )( t 3 ,   s 1 ,  ...  , s 92 ), ( s 94 ,   s 95 ,  ...  ,   s 177 )( t 1 ,   s 94 ,  ...  ,   s 176 ), ( s 178 ,   s 179 ,  ...  ,   s 288 )( t 2 ,   s 178 ,  ...  ,   s 287 ).
No output is produced during the first 1,152 rounds. After this initialization phase, the value of z is output as the key stream at each round.
- 2. Description of Attack Process
For the 288-round Trivium, we have
z 1 = s 288 (66)+ s 288 (93)+ s 288 (162)          + s 288 (177)+ s 288 (243)+ s 288 (288),
where st ( i ) is the i th internal state bit at time t . The output z 1 is the sum of bits s 288 (66), s 288 (93), s 288 (162), s 288 (177), s 288 (243), and s 288 (288). The ANF of z 1 is found exhaustively in terms of the internal state bit at t = 144 as [5]
z 1 = s 144 (6)+ s 144 (16) s 144 (17)+ s 144 (31) s 144 (32)          + s 144 (33)+ s 144 (57)+ s 144 (82) s 144 (83)+ s 144 (84)          + s 144 (96)+ s 144 (97) s 144 (98)+ s 144 (99)+ s 144 (111)          + s 144 (129)+ s 144 (142) s 144 (143)+ s 144 (144)+ s 144 (150)          + s 144 (162)+ s 144 (163) s 144 (164)+ s 144 (165)+ s 144 (186)          + s 144 (192)+ s 144 (208) s 144 (209)+ s 144 (210)+ s 144 (231)          + s 144 (235) s 144 (236)+ s 144 (237)+ s 144 (252),
and its closest linear approximation [5] is
z 1 = s 144 (6)+ s 144 (33)+ s 144 (57)+ s 144 (84)          + s 144 (96)+ s 144 (99)+ s 144 (111)+ s 144 (129)          + s 144 (144)+ s 144 (150)+ s 144 (162)+ s 144 (165)          + s 144 (186)+ s 144 (192)+ s 144 (210)+ s 144 (231)          + s 144 (237)+ s 144 (252)
with bias 2 −9 .
Since the purpose of the attack is to find a linear approximation in the key, IV and output bits, the linear approximation given above is to be rewritten in terms of s 0 ( i ), i = 1, 2, … , 80, 94, 95, … , 173. The remaining terms are omitted since they are assigned to constants during the initialization phrase. Now, we have the right equation in (7) (see Appendix A), whereas it should be noted that there is something wrong in the one given by M.S. Turan and O. Kara [5] , which would affect the subsequent results.
Next, we show that there are still some improvements for our purpose when using a cube–linear attack. For the sake of analysis, we denote K monomials and IV monomials the XOR of monomials involving only the key bits and the XOR of monomials involving only the IV bits, respectively and set the XOR of the remaining monomials as (IV· K ) monomials . Then, (7) could be written as follows:
z 1 = K monomials + IV monomials + ( IVK ) monomials      = K monomials + IV monomials + iv 25 k 14 + iv 25 k 41 + iv 25 k 39 k 40          + iv 26 k 13 + iv 26 k 40 + iv 26 k 38 k 39 + iv 31 k 20 + iv 31 k 47          + iv 31 k 45 k 46 + iv 32 k 19 + iv 32 k 46 + iv 32 k 44 k 45 + iv 49 k 38          + iv 49 k 65 + iv 49 k 63 k 64 + iv 50 k 37 + iv 50 k 64 + iv 50 k 62 k 63          + iv 70 k 59 + iv 71 k 58 + iv 76 k 65 + iv 77 k 64 .
Observing (8), it is very easy to split it into the form p ( x 1 , … , xn ) = tI · P S(I) + q ( x 1 , … , xn ) and to determine whether there is such an index subset I that leads to a linear expression P S(I) based on the basic idea of cube attacks. Actually, the four index subsets {70}, {71}, {76}, and {77} in (8) are available for recovering k 58 , k 59 , k 64 , and k 65 , respectively. Taking the index subset I = {77} as a case in point, we explain how to recover k 64 using our method.
Step 1. Determine the index subset I = {77} and CI = {iv 77 } accordingly.
Step 2. According to cube attacks, the other public variables could be assigned any values except for those variables belonging to CI . For simplicity, we set iv 25 , iv 26 , iv 31 , iv 32 , iv 49 , and iv 50 to zero. When iv 77 = 0 (that is, iv 25 , iv 26 , iv 31 , iv 32 , iv 49 , iv 50 , iv 70 , iv 71 , iv 76 , and iv 77 all equal zero), we get the following:
K monomials =z ,
which holds with a bias of 2 −9 , where z stands for the XOR of all the known variables in (8) after assigning IV. When iv 77 = 1 (that is, iv 25 , iv 26 , iv 31 , iv 32 , iv 49 , iv 50 , iv 70 , iv 71 , iv 76 are all equal zero, and iv 77 equals one), we get the following:
k 64 + K monomials = z  ,
which holds with a bias of 2 −9 , where z ′ stands for the XOR of all the known variables in (8) after assigning IV.
Note that (9) and (10) are independent of each other since the key bit k 64 could be considered as a random variable. According to Theorem 4 and Table 1 , here, we can achieve a lower data complexity using scheme A compared to scheme B at the same level of success rate. Therefore, we can respectively recover K monomials and k 64 + K monomials with a success rate of 99.77% and data complexity of 2 19 IVs according to scheme A.
Step 3. Based on the above analysis, k 64 = ( k 64 + K monomials ) + ( K monomials ). The success rate to recover k 64 is 99.54% requiring 2 20 IVs. Here, the total success rate 99.54% is approximately calculated as 0.9977 2 since the failure rate (1−0.9977) is so small that (1−0.9977) 2 can be ignored according to Theorem 2.
Similarly, k 58 , k 59 , and k 65 can be recovered correspondingly for the other index subsets {70}, {71}, and {76}. Therefore, we can simultaneously recover the four key bits k 58 , k 59 , k 64 , and k 65 with success rate 98.17% (0.9977 8 ) and 2 21.32 IVs (5 × 2 19 ≈ 2 21.32 ). The above results are given in Table 2 .
Key information recovered.
(iv25, iv26, iv31, iv32, iv49,iv50, iv70, iv71, iv76, iv77) Key Bias Data complexity Success rate (%)
(0, 0, 0, 0, 0,0, 0, 0, 0, 0) Kmonimials 2−9 219 99.77
(0, 0, 0, 0, 0,0, 1, 0, 0, 0) Kmonimials+k59 2−9 219 99.77
(0, 0, 0, 0, 0,0, 0, 1, 0, 0) Kmonimials+k58 2−9 219 99.77
(0, 0, 0, 0, 0,0, 0, 0, 1, 0) Kmonimials+k65 2−9 219 99.77
(0, 0, 0, 0, 0,0, 0, 0, 0, 1) Kmonimials+k64 2−9 219 99.77
- 3. Search for Better Linear Approximations
The above attack provides us with a paradise of improvement in the search for better linear approximations. Since we aren’t aware of the actual values of the four key bits recovered and they are dependent upon concrete scenarios, note that all 16 possible values of the four bits now have to be considered. In the following article, we take ( k 58 , k 59 , k 64 , k 65 ) = (1, 1, 1, 1) as an example to illustrate how to look for linear approximations with bigger bias.
Returning ( k 58 , k 59 , k 64 , k 65 ) = (1, 1, 1, 1) to (7), we have equation (11) (see Appendix A). Observing (11), there are 34 linear, 48 quadratic, and 18 cubic terms. The linear approximation to (11) could be naturally obtained as
z 1  = 1+ k 3 + k 6 + k 15 + k 21 + k 27 + k 30 +   k 37 + k 38           + k 39 + k 54 + k 57 + k 63 + k 67 + k 68 + k 69 + k 72 + iv 3           + iv 6 + iv 21 + iv 24 + iv 30 + iv 33 + iv 39 + iv 45 + iv 49           + iv 50 + iv 51 + iv 70 + iv 71 + iv 72 + iv 76 + iv 77 + iv 78
with bias 2 65 ·(0.25) 48 ·(0.375) 18 = 2 −56.56 , assuming all nonlinear terms are independent.
According to (5)–(12), the bias that (12) holds could be calculated as
ε=2⋅ 2 −9 ⋅ 2 −56.56 = 2 −64.56 < 2 −| IV | /2 = 2 −40
(|IV| is referred to as the size of IV, and in particular, it is 80 bits for Trivium). However, the bias is too small to be used to recover any information about the key. According to the method used in [5] , the magnitude of the bias can be increased when allowed to choose some special key bits and IV bits. Unfortunately, the rules of selection weren’t provided, and furthermore, the selected bits in [5] weren’t also the best. So, here, Algorithm 2 (see Appendix B) is proposed to choose those special key bits or IV bits.
Denote and Ω K = { ki | ki = 0, 1 ≤ i ≤ 80} and Ω IV = {iv i | iv i = 0, 1 ≤ i ≤ 80} as the key bit set chosen to zero and the IV bit set chosen to zero, respectively. Then, |Ω K | and |Ω IV | represent the size of Ω K and Ω IV , respectively, and n 1 and n 2 are referred to as the individual quantity of the remaining quadratic and cubic terms, respectively, after Ω K and Ω IV have been determined. The bias of a linear approximation to (11) is calculated as follows:
ε=2 2 9 2 ( n 1 + n 2 )1 (0.25) n 1 (0.375) n 2 = (0.5) n 1 +9 (0.75) n 2 .
When given |Ω K | and |Ω IV |, we can make use of Algorithm 2 to choose Ω K and Ω IV . Algorithm 2 takes finding better linear approximations as a criterion, and the main ideas are to select such a bit that can eliminate the most nonlinear monomials when being set to zero. More detailed rules to cope with more complicated scenarios are showed in Algorithm 2. When |Ω K | = 10 and |Ω IV | = 10, for instance, Table 3 provides the two sets Ω K and Ω IV using Algorithm 2.
(ΩK, ΩIV) and corresponding biasε.
(ΩIV, |ΩIV| = 10) (ΩK, |ΩK| = 10) Bias
iv10 iv13 iv25 iv31 iv34iv37 iv40 iv50 iv54 iv55 k13 k19 k23 k38k40 k45 k46 k50 k63 k5 2−23
k67 2−23
k68 2−23
According to the different Ω K of Table 3 , three linear approximations with the same bias 2 −23 are found. These are
z 1  =  1+ k 3 + k 6 + k 15 + k 21 + k 27 + k 30 +   k 37 + k 38 + k 39            + k 54 + k 57 + k 63 + k 67 + k 68 + k 69 + k 72 + iv 3 + iv 6            + iv 21 + iv 24 + iv 30 + iv 33 + iv 39 + iv 45 + iv 49 + iv 50            + iv 51 + iv 70 + iv 71 + iv 72 + iv 76 + iv 77 + iv 78 ,
z 1  =  1+ k 3 + k 6 + k 15 + k 21 + k 27 + k 30 + k 37 + k 38 + k 39           + k 54 + k 57 + k 63 + k 68 + k 69 + k 72 + iv 3 + iv 6 + iv 21           + iv 24 + iv 30 + iv 33 + iv 39 + iv 45 + iv 49 + iv 50 + iv 51           + iv 70 + iv 71 + iv 72 + iv 76 + iv 77 + iv 78 ,
and
z 1  =  1+ k 3 + k 6 + k 15 + k 21 + k 27 + k 30 + k 37 + k 38 + k 39            + k 54 + k 57 + k 63 + k 67 + k 69 + k 72 + iv 3 + iv 6 + iv 21            + iv 24 + iv 30 + iv 33 + iv 39 + iv 45 + iv 49 + iv 50 + iv 51            + iv 70 + iv 71 + iv 72 + iv 76 + iv 77 + iv 78 .
Similarly, when ( k 58 , k 59 , k 64 , k 65 ) takes the other 15 possible values, linear approximations can also be found with the same method. The corresponding results are concluded in Table 4 .
(k58k59k63k65) and corresponding (ΩK, ΩIV)
(k58 k59 k63 k65) |ΩIV| = 10 |ΩK| = 10 Bias
(0, 0, 0, 0)(0, 1, 0, 0)(1, 0, 0, 0)(1, 1, 0, 0) iv10 iv13 iv25 iv31 iv34iv37 iv40 iv50 iv54 iv55 k13 k19 k23 k38k39 k40 k45 k46 k50 k5 2−23
k67 2−23
k68 2−23
(0, 0, 1, 1)(0, 1, 1, 1)(1, 0, 1, 1)(1, 1, 1, 1) iv10 iv13 iv25 iv31 iv34iv37 iv40 iv50 iv54 iv55 k13 k19 k23 k38k40 k45 k46 k50 k63 k5 2−23
k67 2−23
k68 2−23
(0, 0, 0, 1)(0, 1, 0, 1)(1, 0, 0, 1)(1, 1, 0, 1) iv10 iv13 iv25 iv31 iv34iv37 iv40 iv50 iv54 iv55 k13 k19 k38 k39k40 k45 k46 k50 k62 k5 2−23
k67 2−23
k68 2−23
(0, 0, 1, 0)(0, 1, 1, 0)(1, 0, 1, 0)(1, 1, 1, 0) iv10 iv13 iv25 iv31 iv34iv37 iv40 iv50 iv54 iv55 k13 k19 k38 k39k40 k45 k46 k50 k63 k5 2−24
k67 2−24
k68 2−24
- 4. Comparison with Previous Results
For the reduced version of 288-round Trivium, M.S. Turan and O. Kara in [5] found a linear approximation with bias of 2 −31 when |Ω K | = 10 and |Ω IV | = 10. In [19] , Jia and others presented a multiple linear cryptanalysis on 288-round Trivium, resulting in another linear approximation with the same bias 2 −31 when |Ω K | = 10 and |Ω IV | = 13. In this paper, we find some linear approximations with bigger bias. Moreover, the additional four key bits are recovered. These results are summarized in Table 5 . Obviously, our attack is better than the previous.
Comparison with previous results.
(|ΩK|, |ΩIV|) Bias Num. Data complexity Success rate (%)
Turan [5] (10, 10) 2−31 1 262 97.8
Ours (10, 10) 2−23.2 3 247.2 97.8
Jia [19] (10, 13) 2−31 2 261 97.8
Ours (10, 13) 2−20.2 3 241.2 97.8
V. Conclusion
In cryptography, linear cryptanalysis is the most basic cryptanalysis approach aiming to look for a linear relation between inputs and outputs. A variety of refinements to the attack have been suggested in the past. In this paper, a novel technique called cube–linear attack is proposed (for the first time) to take a probabilistic polynomial as the attack target and furthermore to mine the secret information that has yet to be exploited by previous linear cryptanalysis methods. It is beneficial to allow for a reduction in the amount of data required for a successful attack in specific circumstances. Applying our method to a specific analysis on Trivium, we get better linear cryptanalysis results. Although a few better cryptanalytic results on Trivium had been published earlier using other attacks, we believe that our method is meaningful from the point of view of improving linear cryptanalysis; furthermore, it could be extended to other ciphers. For example, we have improved the bias of the revised 288-round Trivium algorithms proposed in [20] and [21] by 2 14 and 2 4 , respectively. Therefore, our method is worth considering in launching linear attacks on a cryptosystem.
Appendix A
The two equations (7) and (11) appearing in this paper are as follows:
z 1 =1+ k 3 + k 6 + k 15 + k 21 + k 27 + k 30 + k 39 + k 54 + k 57 + k 67 + k 68         + k 69 + k 72 + iv 3 + iv 6 + iv 21 + iv 24 + iv 30 + iv 33 + iv 39 + iv 45 + iv 51         + iv 72 + iv 78 + k 4 k 5 + k 13 k 14 + k 13 k 41   + k 14 k 40 + k 16 k 17         + k 19 k 20 + k 19 k 47 + k 22 k 23 + k 20 k 46 + k 25 k 26 + k 28 k 29 + k 34 k 35         + k 37 k 38 + k 37 k 65 + k 38 k 64 + k 39 k 40 + k 43 k 44 + k 45 k 46         + k 49 k 50 + k 52 k 53 + k 58 k 59 + k 61 k 62 + k 63 k 64 + iv 77 k 64         + k 64 k 65 + k 67 k 68 + k 70 k 71 + iv 10 iv 11 + iv 13 iv 14 + iv 25 iv 26         + iv 31 iv 32 + iv 34 iv 35 + iv 37 iv 38 + iv 40 iv 56 + iv 41 iv 55 + iv 49 iv 50            + iv 54 iv 55 + iv 58 iv 59 + iv 61 iv 62 + iv 67 iv 68 + iv 70 iv 71         + iv 76 k 65 + iv 73 iv 74 + k 13 k 39 k 40 + k 14 k 38 k 39 + k 19 k 45 k 46         + k 20 k 44 k 45 +   k 37 k 63 k 64 + iv 70 k 59 + k 38 k 39 k 40 + k 38 k 39 k 41         + k 38 k 62 k 63 + k 44 k 45 k 46 + k 44 k 45 k 47 + k 62 k 63 k 64         + iv 71 k 58 + k 62 k 63 k 65 + iv 40 iv 54 iv 55 + iv 41 iv 53 iv 54         + iv 53 iv 54 iv 55 + iv 53 iv 54 iv 56 + iv 25 k 14 + iv 25 k 41 + iv 25 k 39 k 40         + iv 26 k 13 + iv 26 k 40 + iv 26 k 38 k 39 + iv 31 k 20 + iv 31 k 47 + iv 31 k 45 k 46         + iv 32 k 46 + iv 32 k 19 + iv 32 k 44 k 45 + iv 49 k 38 + iv 49 k 65 + iv 49 k 63 k 64         + iv 50 k 37 + iv 50 k 64 + iv 50 k 62 k 63 .
z 1 =1+ k 3 + k 6 + k 15 + k 21 + k 27 + k 30 + k 37 + k 38 + k 39 + k 54         + k 57 + k 63 + k 67 + k 68 + k 69 + k 72 + iv 3 + iv 6 + iv 21 + iv 24         + iv 30 + iv 33 + iv 39 + iv 45 + iv 49 + iv 50 + iv 51 + iv 70 + iv 71         + iv 72 + iv 76 + iv 77 + iv 78 + k 4 k 5 + k 13 k 14 + k 13 k 41         + k 14 k 40 + k 16 k 17 + k 19 k 20 + k 19 k 47 + k 22 k 23 + k 20 k 46         + k 25 k 26 + k 28 k 29 + k 34 k 35 + k 37 k 38 + k 37 k 63 + k 39 k 40         + k 43 k 44 + k 45 k 46 + k 49 k 50 + k 52 k 53 + k 61 k 62 + k 67 k 68         + k 70 k 71 + iv 10 iv 11 + k 70 k 71 + iv 10 iv 11 + iv 13 iv 14 + iv 25 iv 26         + iv 31 iv 32 + iv 34 iv 35 + iv 37 iv 38 + iv 40 iv 56 + iv 41 iv 55         + iv 49 iv 50 + iv 54 iv 55 + iv 58 iv 59 + iv 61 iv 62 + iv 67 iv 68         + iv 70 iv 71 + iv 73 iv 74 + k 13 k 39 k 40 + k 14 k 38 k 39 + k 19 k 45 k 46         + k 20 k 44 k 45 + k 38 k 39 k 40 + k 38 k 39 k 41 + k 38 k 62 k 63         + k 44 k 45 k 46 + k 44 k 45 k 47 + iv 40 iv 54 iv 55 + iv 41 iv 53 iv 54         + iv 53 iv 54 iv 55 + iv 53 iv 54 iv 56 + iv 25 k 14 + iv 25 k 41         + iv 25 k 39 k 40 + iv 26 k 13 + iv 26 k 40 + iv 26 k 38 k 39 + iv 31 k 20         + iv 31 k 47 + iv 31 k 45 k 46 + iv 32 k 46 + iv 32 k 19 + iv 32 k 44 k 45         + iv 49 k 38 + iv 49 k 63 + iv 50 k 37 + iv 50 k 62 k 63 .
Appendix B
        Algorithm 2. Algorithm to choose special key and IV bits. Reset ΩK and ΩIV; Input : nK and nIV (sizes of ΩK and ΩIV to choose respectively) Step 1: Reset Ω and Π, count the frequency of each bit in all nonlinear monomials, put the bits with the highest frequency in Ω; Step 2:      1. When |Ω| = 1, determine the type of the bit in Ω, if it is key bit, then put it to ΩK, otherwise ΩIV;      2. When |Ω| ≥ 2, count the frequency of quadratic term of each bit in Ω, put the bits with the highest frequency in Π:            2.2.1 If |Π| = 1, determine type of the bit in Π, if it is key bit, then put it to ΩK, otherwise ΩIV;            2.2.2 If |Π| ≥ 2, choose arbitrarily one bit in Π, and determine type, if it is key bit, put it in ΩK, otherwise ΩIV; Step 3: Calculate the polynomial again, based on chosen ΩK and ΩIV; Step 4:              if ( (| Ω K |< n K ) && (| Ω IV |< n IV ) ) Return to Step 1;               else if ( (| Ω K |= n K ) && (| Ω IV | n IV ) )                      Stop searching the key bits, return to Step 1, and continue to search the IV bits;               else if ( (| Ω K | n K ) && (| Ω IV | n IV ) )                      Stop searching the IV bits, return to Step 1, and continue to search the key bits;                 else if ( (| Ω K |= n K ) && (| Ω IV |= n IV ) )                      Output: ( ( Ω K ,   Ω IV ),ε ) End
This work was supported by the National Natural Science Foundation of China (Grant No. 61202491, 61272041, and 61272488).
BIO
Corresponding Author  swl_cipher@163.com
Wen-Long Sun received his BE and MSc degrees in cryptology from the Information Science and Technology Institute, Zhengzhou, China, in 2011 and 2014, respectively. Now, he is an assistant engineer at the Beijing Satellite Navigation Center, Beijing, China. His main research interests include cryptology and information security.
guanjie007@163.com
Jie Guan received her PhD degree in cryptology from the Information Science and Technology Institute, Zhengzhou, China, in 2004. She is currently a professor at the Information Science and Technology Institute. Her main subject interest is cryptology, and she teaches information systems, the theory of cryptography, and quantum computation.
References
Golić J.D. “Linear Cryptanalysis of Stream Ciphers,” Fast Softw. Encryption: Int. Workshop Leuven, Belgium Dec. 14–16, 1994 154 - 169    DOI : 10.1007/3-540-60590-8_13
Golić J.D. , Bagini V. , Morgari G. “Linear Cryptanalysis of Bluetooth Stream Cipher,” EUROCRYPT Amsterdam, Netherlands Apr. 28–May 2, 2002 238 - 255    DOI : 10.1007/3-540-46035-7_16
Muller F. , Peyrin T. “Linear Cryptanalysis of the TSC Family of Stream Ciphers,” ASIACRYPT Chennai, India Dec. 4–8, 2005 373 - 394    DOI : 10.1007/11593447_20
Khazaei S. , Hassanzadeh M. 2006 ECRYPT Stream Cipher Project EU
Turan M.S. , Kara O. “Linear Approximations for 2-Round Trivium,” Int. Conf. Security Inf. Netw. Gazimagusa, North Cyprus May 8–10, 2007 96 - 105
Matsui M. , Yamagishi A. “A New Method for Known Plaintext Attack of FEAL Cipher,” EUROCRYPT Balatonfüred, Hungary May 24–28, 1992 81 - 91    DOI : 10.1007/3-540-47555-9_7
Shimizu A. , Miyaguchi S. “Fast Data Encipherment Algorithm FEAL,” EUROCRYPT Apr. 13–15, 1987 267 - 278    DOI : 10.1007/3-540-39118-5_24
Matsui M. “Linear Cryptanalysis Method for DES Cipher,” EUROCRYPT Lofthus, Norway May 23–27, 1993 386 - 397    DOI : 10.1007/3-540-48285-7_33
Matsui M. “The First Experimental Cryptanalysis of the Data Encryption Standard,” CRYPTO Santa Barbara, CA, USA Aug. 21–25, 1994 1 - 11    DOI : 10.1007/3-540-48658-5_1
Kaliski B.S. , Robshaw M.J.B. “Linear Cryptanalysis Using Multiple Approximations,” CRYPTO Santa Barbara, CA, USA Aug. 21–25, 1994 26 - 39    DOI : 10.1007/3-540-48658-5_4
Langford S.K. , Hellman M.E. “Differential-Linear Cryptanalysis,” CRYPTO Santa Barbara, CA, USA Aug. 21–25, 1994 17 - 25    DOI : 10.1007/3-540-48658-5_3
Knudsen L.R. , Robshaw M.J.B. “Non-linear Approximations in Linear Cryptanalysis,” Saragossa, Spain May 12–16, 1996 EUROCRYPT 224 - 236    DOI : 10.1007/3-540-68339-9_20
Bogdanov A. , Rijmen V. 2011 Linear Hulls with Correlation Zero and Linear Cryptanalysis of Block Ciphers http://eprint.iacr.org/ 2011/123 Cryptology ePrint Archive
Dinur I. , Shamir A. “Cube Attacks on Tweakable Black Box Polynomials,” EUROCRYPT Cologne, Germany Apr. 26–30, 2009 278 - 299
Vielhaber M. 2007 Breaking ONE.FIVIUM by AIDA an Algebraic IV Differential Attack http://eprint.iacr.org/2007/413 Cryptology ePrint Archive
Fouque P.-A. , Vannet T. “Improving Key Recovery to 784 and 799 Rounds of Trivium Using Optimized Cube Attacks,” Fast Softw. Encryption: Int. Workshop Singapore Mar. 11–13, 2013 502 - 517    DOI : 10.1007/978-3-662-43933-3_26
De Cannière C. , Preneel B. “TRIVIUM: A Stream Cipher Construction Inspired by Block Cipher Design Principles,” Inf. Security Samos Island, Greece Aug. 30–Sept. 2, 2006 171 - 186    DOI : 10.1007/11836810_13
eSTREAM 2008 The ECRYPT Stream Cipher Project http://www.ecrypt.eu.org/stream
Jia Y. 2011 “Linear Cryptanalysis of 2-Round Trivium with Multiple Approximations,” J. Electron. Inf. Technol. 33 (1) 223 - 227    DOI : 10.3724/SP.J.1146.2010.00334
Afzal M. , Masood A. 2009 Modifications in the Design of Trivium to Increase its Security Level Cryptology ePrint Archive Report
Raj A.S. , Srinivasan C. “Analysis of Algebraic Attack on Trivium and Minute Modification to Trivium,” Conf. Netw. Security Appl. Chennai, India July 15–17, 2011 35 - 42    DOI : 10.1007/978-3-642-22540-6_4