Multiple differential-zero correlation linear cryptanalysis of reduced-round CAST-256

Massoud Hadian Dehkordi; Roghayeh Taghizadeh

doi:10.1515/jmc-2016-0054

Published/Copyright: April 21, 2017

From the journal Journal of Mathematical Cryptology Volume 11 Issue 2

Abstract

CAST-256 (or CAST6) is a symmetric-key block cipher published in June 1998. It was submitted as a candidate for the Advanced Encryption Standard (AES). In this paper, we propose a new chosen-plaintext attack, the multiple differential-zero correlation linear attack, to analyze the CAST-256 block cipher. In terms of the number of rounds attacked, it is the best known attack on CAST-256 without the weak-key assumption. We first construct a 30-round differential-zero correlation linear distinguisher. Based on this distinguisher, we propose the first 33-round attack on CAST-256, with data complexity $2^{115.63}$ and time complexity $2^{238.26}$. The attack recovers 111 subkey bits.

Keywords: CAST-256; symmetric-key block cipher; differential cryptanalysis; zero correlation linear attack; linear approximation

MSC 2010: 94A60

1 Introduction

CAST-256 (CAST6) is a symmetric-key block cipher published in June 1998. It was submitted as a candidate for the Advanced Encryption Standard (AES); however, it was not among the five AES finalists. It is an extension of an earlier cipher, CAST-128; both were designed according to the “CAST” design procedure.

CAST-256 uses the same elements as CAST-128, including the S-boxes, but is adapted to a block size of 128 bits, twice the size of its 64-bit predecessor. CAST-256 is composed of 48 rounds, sometimes described as 12 “quad-rounds”, arranged in a generalized Feistel network.

Differential cryptanalysis is usually a chosen-plaintext attack, applicable primarily to block ciphers. It was invented in 1990 by Biham and Shamir [2]. Linear cryptanalysis was introduced by Matsui [10]; it is a known-plaintext attack, proposed in 1993 to break the Data Encryption Standard (DES).

Linear and differential cryptanalysis are basic tools for evaluating the security of block ciphers. Both methods were applied to attack the block cipher DES faster than an exhaustive key search [3, 9], and both have been identified as effective techniques for breaking a large class of symmetric ciphers.

A differential-linear attack is a combination of linear and differential cryptanalysis. It was introduced by Langford and Hellman [8] in 1994 and applied to break 8-round DES. The attack uses a differential characteristic over part of the cipher with probability 1. The rounds immediately following the differential characteristic have a defined linear approximation, and we expect that for each chosen plaintext pair, the probability of the linear approximation holding for one chosen plaintext but not the other will be lower for the correct key. The attack was generalized by Biham, Dunkelman, and Keller [1] in 2002 to use differential characteristics with probability less than 1.

A zero-correlation linear attack is a promising key-recovery technique for block ciphers developed by Bogdanov et al. in [6, 7]. It is an extension of linear cryptanalysis based on linear approximations that hold with probability exactly $1/2$, which corresponds to correlation zero. We propose a new cryptanalytic method, called the multiple differential-zero correlation linear attack, which combines differential cryptanalysis with zero-correlation linear cryptanalysis.

The best cryptanalysis so far in the classical single-key model without the weak-key assumption has been a linear attack on 32 rounds. We find a 30-round differential-zero correlation linear distinguisher for CAST-256 and attack 33 rounds of CAST-256 using multidimensional differential-zero correlation linear cryptanalysis. In terms of the number of rounds attacked, this is the best known attack on CAST-256 without the weak-key assumption.

In this paper, we propose a new method, the multiple differential-zero correlation linear attack, to analyze the CAST-256 block cipher. By constructing a 30-round distinguisher with the new method, we obtain an attack on 33-round CAST-256 with data complexity $2^{115.63}$ and time complexity $2^{238.26}$. Table 1 summarizes and compares the attacks on CAST-256.

Our paper is organized as follows: Section 2 provides a brief description of CAST-256. Section 3 introduces our new method of multiple differential-zero correlation linear attack. In Section 4, we present details of the 30-round multiple differential-zero correlation linear distinguisher. The 33-round multiple differential-zero correlation linear attack on CAST-256 is discussed in detail in Section 5. We summarize our results in Section 6.

Table 1

Summary of attacks on CAST-256.

Attack	Rounds	Key size (bits)	Data	Time	Reference
Distinguishing	12	128	$2^{101}$	$2^{101}$	[11]
Boomerang	16	128	$2^{49.3}$	-	[12]
Linear	24	192	$2^{124.1}$	$2^{156.52}$	[13]
Multidimensional ZC	28	256	$2^{98.8}$	$2^{246.9}$	[4]
Linear	32	256	$2^{126.8}$	$2^{251}$	[14]
Our attack	33	256	$2^{115.63}$	$2^{238.26}$	this paper

2 Description of CAST-256

CAST-256 is based on CAST-128. It accepts cryptographic keys of 128, 160, 192, 224, or 256 bits and encrypts and decrypts data in blocks of 128 bits. Its S-boxes $S_i$, $1 \le i \le 4$, are non-surjective, with 8-bit input and 32-bit output. CAST-256 has 48 rounds for all key sizes, sometimes described as 12 “quad-rounds”, arranged in a generalized Feistel network with four branches: six forward quad-rounds followed by six reverse quad-rounds. It uses three different round functions, denoted by $F_1$, $F_2$, and $F_3$. Let $X$ and $O$ be the 32-bit input and output of a round function, let $k_r$ and $k_m$ be the 5-bit rotation subkey and the 32-bit masking subkey of the current round, and let $I = (I_1, I_2, I_3, I_4)$ be the 32-bit intermediate value, split into four bytes. Then $F_1$, $F_2$, and $F_3$ are defined as follows:

$$F_1:\quad I = (k_m + X) \lll k_r, \qquad O = ((S_1(I_1) \oplus S_2(I_2)) - S_3(I_3)) + S_4(I_4),$$

$$F_2:\quad I = (k_m \oplus X) \lll k_r, \qquad O = ((S_1(I_1) - S_2(I_2)) + S_3(I_3)) \oplus S_4(I_4),$$

$$F_3:\quad I = (k_m - X) \lll k_r, \qquad O = ((S_1(I_1) + S_2(I_2)) \oplus S_3(I_3)) - S_4(I_4),$$

where $+$, $-$, $\oplus$, and $\lll$ denote addition and subtraction modulo $2^{32}$, bitwise exclusive-OR, and left rotation, respectively.

Figure 1

Forward quad-round of CAST-256.

Figure 2

Design of CAST-256.

Let $\beta = (A, B, C, D)$ be a 128-bit block of CAST-256, where $A$, $B$, $C$, and $D$ are 32-bit words that serve as inputs to the round functions $F_i$, $1 \le i \le 3$.

The forward quad-round $\beta := Q(\beta)$, as shown in Figure 1, is defined as follows:

$$C = C \oplus F_1(D, k_{r_1}^{i}, k_{m_1}^{i}),$$

$$B = B \oplus F_2(C, k_{r_2}^{i}, k_{m_2}^{i}),$$

$$A = A \oplus F_3(B, k_{r_3}^{i}, k_{m_3}^{i}),$$

$$D = D \oplus F_1(A, k_{r_4}^{i}, k_{m_4}^{i}).$$

The reverse quad-round $\beta := Q'(\beta)$ is defined as follows:

$$D = D \oplus F_1(A, k_{r_1}^{i}, k_{m_1}^{i}),$$

$$A = A \oplus F_3(B, k_{r_2}^{i}, k_{m_2}^{i}),$$

$$B = B \oplus F_2(C, k_{r_3}^{i}, k_{m_3}^{i}),$$

$$C = C \oplus F_1(D, k_{r_4}^{i}, k_{m_4}^{i}).$$

Here $k_{r_j}^{i}$ and $k_{m_j}^{i}$ ($1 \le j \le 4$, $1 \le i \le 12$) are the rotation subkey and the masking subkey in the $j$-th round of the $i$-th quad-round, respectively. The design of CAST-256 is illustrated in Figure 2.
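As a minimal sketch of the two quad-round types, the following Python code applies the four Feistel steps in the order listed above, with the fourth forward round updating $D$ from $A$ as in the CAST-256 specification. The round functions here are toy stand-ins built by the hypothetical helper `toy_f`, not the real $F_1$, $F_2$, $F_3$; only the branch wiring is meant to be representative. Keys are passed as four `(kr, km)` pairs in the order in which the rounds use them.

```python
MASK32 = 0xFFFFFFFF


def toy_f(tag):
    """Build a toy keyed 32-bit round function (NOT a real CAST function)."""
    def f(x, kr, km):
        v = (x * 0x9E3779B1 + km + tag) & MASK32
        r = kr % 32
        return ((v << r) | (v >> (32 - r))) & MASK32
    return f


F1, F2, F3 = toy_f(1), toy_f(2), toy_f(3)


def forward_quad_round(block, keys):
    """Q: four rounds updating C, B, A, D in that order."""
    a, b, c, d = block
    c ^= F1(d, *keys[0])
    b ^= F2(c, *keys[1])
    a ^= F3(b, *keys[2])
    d ^= F1(a, *keys[3])
    return a, b, c, d


def reverse_quad_round(block, keys):
    """Q': four rounds updating D, A, B, C in that order."""
    a, b, c, d = block
    d ^= F1(a, *keys[0])
    a ^= F3(b, *keys[1])
    b ^= F2(c, *keys[2])
    c ^= F1(d, *keys[3])
    return a, b, c, d
```

With this ordering, a reverse quad-round run with the same four subkey pairs in reversed order undoes a forward quad-round, which is a convenient sanity check on the wiring.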

3 Multiple differential-zero correlation linear attack

We propose a new cryptanalytic method, called multiple differential-zero correlation linear attack, which combines differential and multiple linear cryptanalysis with correlation exactly zero.

To define a differential-zero correlation linear distinguisher, we treat the block cipher $E = E_1 \circ E_0$ as a cascade of two sub-ciphers $E_0$ and $E_1$. If $\Delta\alpha \to \Delta\beta$ is a (truncated) differential with probability 1 for $E_0$ and $\Gamma\gamma \to \Gamma\delta$ is a linear approximation with bias 0 for $E_1$, where $\Delta\beta \cdot \Gamma\gamma = 0$, then a differential-zero correlation linear distinguisher is defined to be the pair $(\Delta\alpha \to \Delta\beta,\ \Gamma\gamma \to \Gamma\delta)$ consisting of the (truncated) differential and the linear approximation.

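Let $p$ and $p^*$ be two plaintexts satisfying $p \oplus p^* = \Delta\alpha$. Since $E_0(p) \oplus E_0(p^*) = \Delta\beta$ and $\Delta\beta \cdot \Gamma\gamma = 0$, we have

$$E_0(p) \cdot \Gamma\gamma = E_0(p^*) \cdot \Gamma\gamma$$

with probability 1. The differential-zero correlation linear distinguisher is concerned with the event

$$\Gamma\delta \cdot E(p) \oplus \Gamma\delta \cdot E(p^*) = 0, \quad\text{that is,}\quad \Gamma\delta \cdot E(p) = \Gamma\delta \cdot E(p^*).$$

By the assumptions used in [8] we have

$$\Pr\bigl(\Gamma\delta \cdot E(p) \oplus \Gamma\delta \cdot E(p^*) = 0\bigr) = \tfrac{1}{2}.$$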
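The first step of this argument uses only the linearity of the dot product over $\mathbb{F}_2$: XORing a difference into a state leaves a masked parity unchanged exactly when the difference has parity zero under the mask. The short Python check below illustrates this for 128-bit states. The helper names `pack`, `parity`, `masked_parity_preserved`, and `check_random_states` are ours, and the concrete shapes $(0,0,\alpha,0)$ and $(0,0,0,L)$ are used only as an example of a difference and a mask on disjoint words.

```python
import random


def pack(a, b, c, d):
    """Pack four 32-bit words (A, B, C, D) into one 128-bit integer."""
    return (a << 96) | (b << 64) | (c << 32) | d


def parity(mask, x):
    """Dot product <mask, x> over F_2: parity of the bits of x selected by mask."""
    return bin(mask & x).count("1") & 1


def masked_parity_preserved(delta_beta, gamma):
    """True when XORing delta_beta into any state keeps its gamma-parity."""
    return parity(gamma, delta_beta) == 0


def check_random_states(delta_beta, gamma, trials=1000, seed=1):
    """Empirically compare <gamma, x> and <gamma, x ^ delta_beta> on random x."""
    rng = random.Random(seed)
    return all(
        parity(gamma, x) == parity(gamma, x ^ delta_beta)
        for x in (rng.getrandbits(128) for _ in range(trials))
    )
```

For example, `masked_parity_preserved(pack(0, 0, alpha, 0), pack(0, 0, 0, L))` holds for every 32-bit `alpha` and `L`, because the difference and the mask touch different words.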

Let $z_i = \langle u_i, a \rangle + \langle w_i, b \rangle$, $i = 1, \ldots, m$, be $m$ linear equations, where $a \in \mathbb{F}_2^n$ is the plaintext and $b \in \mathbb{F}_2^m$ is some part of the data in the encryption process. Instead of considering each such bit and its distribution independently as $a$ varies, we analyze the distribution of the $m$-tuples $z = (z_1, \ldots, z_m)$.

The probability distribution of $z$ is related to the correlations $c_\gamma$ of the linear combinations indexed by $\gamma \in \mathbb{F}_2^m$ as follows:

$$\Pr[z] = 2^{-m} \sum_{\gamma \in \mathbb{F}_2^m} (-1)^{\langle \gamma, z \rangle} c_\gamma.$$

We assume that the correlations of all $m$ linear equations and of all their nonzero linear combinations are equal to zero, that is, $c_\gamma = 0$ for all $\gamma \neq 0$. Substituting this into the formula for $\Pr[z]$, and using $c_0 = 1$, shows that $z$ is uniformly distributed over $\mathbb{F}_2^m$.
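The relation between a distribution and its correlations can be checked numerically for small $m$. The sketch below assumes the usual convention $c_\gamma = \sum_z (-1)^{\langle \gamma, z \rangle} \Pr[z]$, so that the inverse transform carries the normalization factor $2^{-m}$ and $c_0 = 1$; the function names are ours.

```python
def dot(u, v):
    """Inner product over F_2 of two bit-vectors packed into integers."""
    return bin(u & v).count("1") & 1


def correlations(prob, m):
    """c_gamma = sum_z (-1)^<gamma,z> Pr[z] for every gamma in F_2^m."""
    size = 2 ** m
    return [
        sum((-1) ** dot(g, z) * prob[z] for z in range(size))
        for g in range(size)
    ]


def distribution(corr, m):
    """Pr[z] = 2^-m * sum_gamma (-1)^<gamma,z> c_gamma for every z in F_2^m."""
    size = 2 ** m
    return [
        sum((-1) ** dot(g, z) * corr[g] for g in range(size)) / size
        for z in range(size)
    ]
```

Feeding in `corr = [1, 0, ..., 0]` (all nontrivial correlations zero) returns the uniform distribution with every probability equal to $2^{-m}$, which is the case used in the text.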

We select $N$ distinct pairs $(p, p^*)$ for an $n$-bit block cipher $E = E_1 \circ E_0$, where $\Delta\alpha \to \Delta\beta$ is a (truncated) differential with probability 1 for $E_0$ and $\Gamma\gamma \to \Gamma\delta_i$, $i = 1, \ldots, m$, are $m$ linear approximations with bias zero for $E_1$ such that $\Delta\beta \cdot \Gamma\gamma = 0$ and all their nonzero linear combinations have correlation zero. We compute

$$z_i = \Gamma\delta_i \cdot E(p) \oplus \Gamma\delta_i \cdot E(p^*), \quad i = 1, \ldots, m.$$

Then we can construct, as shown above, a function $f : \mathbb{F}_2^n \to \mathbb{F}_2^m$ whose outputs $z = (z_1, \ldots, z_m)$, computed for all chosen plaintexts, are uniformly distributed $m$-tuples of bits in $\mathbb{F}_2^m$. Such a completely uniform distribution is very unlikely to be obtained by selecting the values at random in $\mathbb{F}_2^m$, even if each value is equally probable. We can therefore distinguish the non-random behavior of the cipher data with much less data than the full codebook: the distribution of the cipher data follows a multivariate hypergeometric distribution, while data drawn at random from a uniform distribution on $\mathbb{F}_2^m$ follows a multinomial distribution [5].

A counter $V[z] = 0$ is initialized for each of the $2^m$ data values $z \in \mathbb{F}_2^m$. Then, for each distinct plaintext pair $(p, p^*)$, we compute the corresponding data value in $\mathbb{F}_2^m$ (by evaluating the $m$ basis linear approximations) and increment the counter $V[z]$ of this data value by one. We then compute the statistic $T$ for this distribution as

$$T = \sum_{z=0}^{2^m-1} \frac{\left(V[z] - N 2^{-m}\right)^2}{N 2^{-m}\left(1 - 2^{-m}\right)}.$$

For sufficiently large sample size $N$ and number $\ell$ of zero-correlation linear approximations given for the cipher, the statistic $T$ follows one of two distinct distributions:

For the cipher exhibiting zero correlation, the statistic $T$ follows a $\chi^2$-distribution with mean $\mu_0$ and variance $\sigma_0^2$ given by
$$\mu_0 = (\ell - 1)\,\frac{2^n - N}{2^n - 1}, \qquad \sigma_0^2 = 2(\ell - 1)\left(\frac{2^n - N}{2^n - 1}\right)^2.$$
For a randomly drawn permutation, which models a wrong key guess, the statistic $T$ follows a $\chi^2$-distribution with mean $\mu_1 = \ell - 1$ and variance $\sigma_1^2 = 2(\ell - 1)$.

The proof of this proposition is available in [4].
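The counting procedure and the statistic $T$ translate directly into code. The following Python sketch uses our own function names and treats the masks and ciphertexts as plain integers; it illustrates the formulas above and is not an implementation of the attack.

```python
def parity(mask, x):
    """Dot product <mask, x> over F_2."""
    return bin(mask & x).count("1") & 1


def z_value(masks, c, c_star):
    """Pack z_i = <mask_i, c> xor <mask_i, c_star> (i = 1..m) into an integer."""
    z = 0
    for i, mask in enumerate(masks):
        z |= (parity(mask, c) ^ parity(mask, c_star)) << i
    return z


def count_values(masks, pairs):
    """Return the counters V[z] over all 2^m values for a list of (c, c_star)."""
    counters = [0] * (2 ** len(masks))
    for c, c_star in pairs:
        counters[z_value(masks, c, c_star)] += 1
    return counters


def statistic_T(counters, N):
    """T = sum_z (V[z] - N 2^-m)^2 / (N 2^-m (1 - 2^-m))."""
    cells = len(counters)          # 2^m
    expected = N / cells           # N 2^-m
    return sum((v - expected) ** 2 for v in counters) / (expected * (1 - 1 / cells))


def chi2_params_cipher(l, n, N):
    """Mean and variance of T for the cipher (zero-correlation case)."""
    scale = (2 ** n - N) / (2 ** n - 1)
    return (l - 1) * scale, 2 * (l - 1) * scale ** 2


def chi2_params_random(l):
    """Mean and variance of T for a randomly drawn permutation."""
    return l - 1, 2 * (l - 1)
```

A perfectly flat set of counters gives $T = 0$, and for $N = 2^n$ the cipher-side mean and variance both vanish, which matches the intuition that the full codebook yields an exactly uniform distribution.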

4 The 30-round differential-zero correlation linear distinguisher

In this section, we present a 30-round differential-zero correlation linear distinguisher, which consists of a 2-round differential characteristic with probability 1 followed by a 28-round linear approximation with correlation 0.

The 30-round differential-zero correlation linear distinguisher is made of a 28-round linear approximation $\Gamma\gamma \to \Gamma\delta$ with correlation 0 for rounds 5 to 32 (four forward quad-rounds followed by three reverse quad-rounds) and a 2-round truncated differential $\Delta\alpha \to \Delta\beta$ for rounds 3 to 4 that satisfies $\Delta\beta \cdot \Gamma\gamma = 0$. If the input mask is $(0,0,0,L)$ and the output mask is $(0,0,0,L)$, then the correlation of the linear approximation for the 24-round CAST-256 is zero.

4.1 The 2-round differential characteristic

The 2-round truncated differential $\Delta\alpha \to \Delta\beta$ with probability 1 is $(0,0,\alpha,0) \to (0,0,\alpha,0)$, as illustrated in Figure 3.

Figure 3

Truncated differential (0,0,α,0)→(0,0,α,0) with probability 1.
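Why this differential holds with probability 1 can be seen from the branch wiring alone: in the third and fourth rounds of a forward quad-round, the round functions read $B$ and $A$, which carry no difference, so nothing is XORed into the active word $C$ or spread to the other words. The Python sketch below checks this with a toy keyed function (`toy_f`, an assumption standing in for $F_3$ and $F_1$); the property does not depend on the choice of round function.

```python
import random

MASK32 = 0xFFFFFFFF


def toy_f(x, kr, km):
    """Toy keyed 32-bit function standing in for F1/F3 (NOT the real CAST functions)."""
    v = (x * 0x9E3779B1 + km) & MASK32
    r = kr % 32
    return ((v << r) | (v >> (32 - r))) & MASK32


def rounds_3_and_4(block, k3, k4):
    """Third and fourth rounds of a forward quad-round."""
    a, b, c, d = block
    a ^= toy_f(b, *k3)  # round 3: A = A xor F3(B)
    d ^= toy_f(a, *k4)  # round 4: D = D xor F1(A)
    return a, b, c, d


def output_difference(block, alpha, k3, k4):
    """XOR difference after two rounds for the input difference (0, 0, alpha, 0)."""
    a, b, c, d = block
    x = rounds_3_and_4((a, b, c, d), k3, k4)
    y = rounds_3_and_4((a, b, c ^ alpha, d), k3, k4)
    return tuple(u ^ v for u, v in zip(x, y))


def holds_for_random_inputs(trials=200, seed=7):
    """Check the differential on random blocks, keys, and differences."""
    rng = random.Random(seed)
    for _ in range(trials):
        block = tuple(rng.getrandbits(32) for _ in range(4))
        alpha = rng.getrandbits(32)
        k3 = (rng.getrandbits(5), rng.getrandbits(32))
        k4 = (rng.getrandbits(5), rng.getrandbits(32))
        if output_difference(block, alpha, k3, k4) != (0, 0, alpha, 0):
            return False
    return True
```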

4.2 The 28-round zero correlation linear characteristic

The construction of a 28-round linear characteristic is illustrated in Figure 4, which is from round 9 to round 36 (four forward quad-rounds followed by three reverse quad-rounds).

Figure 4

The 28-round zero correlation linear characteristic.

5 Key recovery attack on 33-round CAST-256

We use the 30-round differential-zero correlation linear distinguisher to attack 33 rounds of CAST-256.

The attack works as follows:

Choose $\lambda$ structures $S_i$, $i = 0, 1, \ldots, \lambda - 1$, where a structure is defined to be a set of $2^{64}$ plaintexts $P_{i,j}$, $j = 0, 1, \ldots, 2^{64}-1$, with 64 bits taking all possible values and the other 64 bits fixed. In a chosen-plaintext attack scenario, obtain the ciphertexts for the $2^{64}$ plaintexts in each of the $\lambda$ structures; we denote the ciphertext for plaintext $P_{i,j}$ by $C_{i,j}$.
Allocate a 32-bit global counter $V[z]$ for each of the $2^{32}$ possible values of the 32-bit vector $z$ and set it to 0. $V[z]$ will contain the number of times the vector value $z$ occurs for the current key guess. The vector $z$ consists of the evaluations of the 32 basis zero-correlation masks.
Guess a value for $(k_{r_1}^{1}, k_{m_1}^{1}, k_{r_2}^{1}, k_{m_2}^{1})$ and do as follows:
(a) Partially encrypt every plaintext $P_{i,j}$ with the guessed $(k_{r_1}^{1}, k_{m_1}^{1}, k_{r_2}^{1}, k_{m_2}^{1})$ to get its intermediate value immediately after 2 rounds; we denote it by $\varepsilon_{i,j}$.
(b) Compute $\varepsilon_{i,j} \oplus (0,0,\alpha,0)$; we denote the resulting value by $\hat{\varepsilon}_{i,j}$.
(c) Partially decrypt $\hat{\varepsilon}_{i,j}$ with the guessed $(k_{r_1}^{1}, k_{m_1}^{1}, k_{r_2}^{1}, k_{m_2}^{1})$ to get its plaintext, and find this plaintext in $S_i$; we denote it by $\hat{P}_{i,j}$, and the corresponding ciphertext by $\hat{C}_{i,j}$.
(d) Guess a value for $(k_{r_1}^{9}, k_{m_1}^{9})$ and do as follows:
  (i) For each pair $(C_{i,j}, \hat{C}_{i,j})$ of ciphertexts, partially decrypt it with the guessed $(k_{r_1}^{9}, k_{m_1}^{9})$ to get the pair of 32-bit values covered by the output mask, compute $\Gamma\delta \cdot F_1^{-1}(C_{i,j}) \oplus \Gamma\delta \cdot F_1^{-1}(\hat{C}_{i,j})$, $i = 1, \ldots, 2^{32}-1$, and increment $V[z]$ when $\Gamma\delta \cdot F_1^{-1}(C_{i,j}) \oplus \Gamma\delta \cdot F_1^{-1}(\hat{C}_{i,j})$ is zero.
  (ii) Compute the statistic
    $$T = \sum_{z=0}^{2^{32}-1} \frac{\left(V[z] - N 2^{-m}\right)^2}{N 2^{-m}\left(1 - 2^{-m}\right)}$$
    for this distribution.
  (iii) If the guess for $(k_{r_1}^{1}, k_{m_1}^{1}, k_{r_2}^{1}, k_{m_2}^{1}, k_{r_1}^{9}, k_{m_1}^{9})$ belongs to the first $\phi$ guesses for $(k_{r_1}^{1}, k_{m_1}^{1}, k_{r_2}^{1}, k_{m_2}^{1}, k_{r_1}^{9}, k_{m_1}^{9})$, then record the guess; otherwise, remove the guess with the smallest deviation from the $\phi$ guesses.

In this attack, we set $\alpha_0 = 2^{-10}$ (type I error probability, the probability of missing the right key) and $\alpha_1 = 2^{-20}$ (type II error probability, the probability of accepting a wrong key).

With these parameters, the data complexity given by Bogdanov et al. in [5, Corollary 2] is $2^{115.63}$ distinct plaintext-ciphertext pairs, which corresponds to $\lambda = 2^{51.63}$ structures. The success probability of the entire attack is about 0.99.

The time complexity of steps 2(a) and 2(c) is $\lambda \times 2 \times 2^{63} \times 2^{37 \times 2} \approx 2^{189.63}$.

The time complexity of step 2(d) is $\lambda \times 2^{64} \times 2^{37 \times 2} \times 2^{37} \approx 2^{238.26}$.

Since $\alpha_1 = 2^{-20}$ and 111 subkey bits are recovered in total, the number of remaining subkey candidates is $2^{-20} \times 2^{111} = 2^{91}$. We then exhaustively search the other $256 - 111 = 145$ key bits, which takes $2^{91+145} = 2^{236}$ 33-round encryptions.
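Several of these figures can be re-derived by adding exponents. The sketch below works in $\log_2$ units and covers the data complexity, the cost of steps 2(a) and 2(c), the number of recovered subkey bits, and the final exhaustive search. It assumes that each guessed round contributes $5 + 32 = 37$ subkey bits and that the factor for the two guessed first rounds is $2^{37 \cdot 2} = 2^{74}$; the function and constant names are ours.

```python
LOG2_LAMBDA = 51.63       # log2 of the number of structures
ROUND_KEY_BITS = 5 + 32   # one rotation subkey plus one masking subkey


def log2_data():
    """lambda structures of 2^64 plaintexts each."""
    return LOG2_LAMBDA + 64


def log2_time_steps_a_c():
    """lambda * 2 * 2^63 * 2^(37*2), as exponents."""
    return LOG2_LAMBDA + 1 + 63 + 2 * ROUND_KEY_BITS


def recovered_key_bits():
    """Three guessed rounds: two at the top and one at the bottom."""
    return 3 * ROUND_KEY_BITS


def log2_surviving_candidates(log2_alpha1=-20):
    """Wrong-key acceptance rate times the number of guessed subkeys."""
    return log2_alpha1 + recovered_key_bits()


def log2_exhaustive_search(log2_alpha1=-20, key_bits=256):
    """Each surviving candidate is extended over the remaining key bits."""
    return log2_surviving_candidates(log2_alpha1) + (key_bits - recovered_key_bits())
```

Under these assumptions the functions return the exponents 115.63, 189.63, 111, 91, and 236, in agreement with the values quoted above for those terms.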

6 Conclusions

In this paper, we present a new attack, the multiple differential-zero correlation linear attack. By analyzing the property of the concatenation of forward and reverse quad-rounds, we construct a 30-round distinguisher for CAST-256. Based on this distinguisher, we propose the first 33-round attack on CAST-256 without the weak-key assumption, with data complexity $2^{115.63}$ and time complexity $2^{238.26}$. The attack recovers 111 subkey bits.

Communicated by Kwangjo Kim

References

[1] E. Biham, O. Dunkelman and N. Keller, Enhancing differential-linear cryptanalysis, Advances in Cryptology – ASIACRYPT 2002, Lecture Notes in Comput. Sci. 2501, Springer, Berlin (2002), 254–266. doi:10.1007/3-540-36178-2_16

[2] E. Biham and A. Shamir, Differential cryptanalysis of DES-like cryptosystems, Advances in Cryptology – CRYPTO ’90, Lecture Notes in Comput. Sci. 537, Springer, Berlin (1990), 2–21. doi:10.1007/3-540-38424-3_1

[3] E. Biham and A. Shamir, Differential cryptanalysis of the full 16-round DES, Advances in Cryptology – CRYPTO ’92, Lecture Notes in Comput. Sci. 740, Springer, Berlin (1993), 487–496. doi:10.1007/3-540-48071-4_34

[4] A. Bogdanov, G. Leander, K. Nyberg and M. Wang, Integral and multidimensional linear distinguishers with correlation zero, preprint (2012), https://www.iacr.org/archive/asiacrypt2012/76580239/76580239.pdf. doi:10.1007/978-3-642-34961-4_16

[5] A. Bogdanov, G. Leander, K. Nyberg and M. Wang, Integral and multidimensional linear distinguishers with correlation zero, Advances in Cryptology – ASIACRYPT 2012, Lecture Notes in Comput. Sci. 7658, Springer, Berlin (2012), 244–261. doi:10.1007/978-3-642-34961-4_16

[6] A. Bogdanov and V. Rijmen, Linear hulls with correlation zero and linear cryptanalysis of block ciphers, Des. Codes Cryptogr. 70 (2014), 369–383. doi:10.1007/s10623-012-9697-z

[7] A. Bogdanov and M. Wang, Zero correlation linear cryptanalysis with reduced data complexity, Fast Software Encryption – FSE ’12, Lecture Notes in Comput. Sci. 7549, Springer, Berlin (2012), 29–48. doi:10.1007/978-3-642-34047-5_3

[8] S. K. Langford and M. E. Hellman, Differential-linear cryptanalysis, Advances in Cryptology – CRYPTO ’94, Lecture Notes in Comput. Sci. 839, Springer, Berlin (1994), 17–25. doi:10.1007/3-540-48658-5_3

[9] M. Matsui, Linear cryptanalysis method for DES cipher, Advances in Cryptology – EUROCRYPT ’93, Lecture Notes in Comput. Sci. 765, Springer, Berlin (1994), 386–397. doi:10.1007/3-540-48285-7_33

[10] M. Matsui and A. Yamagishi, A new method for known plaintext attack of FEAL cipher, Advances in Cryptology – EUROCRYPT ’92, Lecture Notes in Comput. Sci. 658, Springer, Berlin (1993), 81–91. doi:10.1007/3-540-47555-9_7

[11] J. J. Nakahara and M. Rasmussen, Linear analysis of reduced-round CAST-128 and CAST-256, Proceedings of the 7th Brazilian Symposium on Information and Computer System Security, Federal University of Rio de Janeiro, Rio de Janeiro (2007), 45–55. doi:10.5753/sbseg.2007.20914

[12] D. Wagner, The boomerang attack, Fast Software Encryption – FSE ’99, Lecture Notes in Comput. Sci. 1636, Springer, Berlin (1999), 156–170. doi:10.1007/3-540-48519-8_12

[13] M. Q. Wang, X. Y. Wang and C. H. Hu, New linear cryptanalytic results of reduced-round of CAST-128 and CAST-256, Selected Areas in Cryptography – SAC 2008, Lecture Notes in Comput. Sci. 5381, Springer, Berlin (2009), 429–441. doi:10.1007/978-3-642-04159-4_28

[14] J. Y. Zhao, M. Q. Wang and L. Wen, Improved linear cryptanalysis of CAST-256, J. Comput. Sci. Tech. 29 (2014), 1134–1139. doi:10.1007/s11390-014-1496-8

Received: 2016-9-14

Accepted: 2017-2-9

Published Online: 2017-4-21

Published in Print: 2017-6-1

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Creative Commons

BY-NC-ND 3.0