On linear codes with random multiplier vectors and the maximum trace dimension property

Márton Erdélyi; Pál Hegedüs; Sándor Z. Kiss; Gábor P. Nagy

doi:10.1515/jmc-2023-0022

Article Open Access


Published/Copyright: February 14, 2024


From the journal Journal of Mathematical Cryptology Volume 18 Issue 1

Abstract

Let $C$ be a linear code of length $n$ and dimension $k$ over the finite field $\mathbb{F}_{q^m}$. The trace code $\mathrm{Tr}(C)$ is a linear code of the same length $n$ over the subfield $\mathbb{F}_q$. The obvious upper bound for the dimension of the trace code over $\mathbb{F}_q$ is $mk$. If equality holds, then we say that $C$ has maximum trace dimension. The problem of finding the true dimension of trace codes and their duals is relevant for the size of the public key of various code-based cryptographic protocols. Let $C_a$ denote the code obtained from $C$ and a multiplier vector $a \in (\mathbb{F}_{q^m})^n$. In this study, we give a lower bound for the probability that a random multiplier vector produces a code $C_a$ of maximum trace dimension. We give an interpretation of the bound for the class of algebraic geometry codes in terms of the degree of the defining divisor. The bound explains the experimental fact that random alternant codes have minimal dimension. Our bound holds whenever $n \ge m(k+h)$, where $h \ge 0$ is the Singleton defect of $C$. For the extremal case $n = m(h+k)$, numerical experiments reveal a close connection between the probability of having maximum trace dimension and the probability that a random matrix has full rank.

Keywords: trace codes; subfield subcodes; dimension of trace codes; random alternant codes; weight enumerator; Singleton defect

MSC 2010: 14G50; 15A03

1 Introduction

1.1 Code-based post-quantum cryptosystems

Recent research has focused extensively on quantum computers that use quantum mechanical techniques to solve difficult mathematical computational problems [1]. The existence of these potent devices poses a threat to numerous widely used public-key cryptosystems [2]. McEliece [3] introduced the first code-based public-key cryptosystem in 1978. One of the most pressing problems in cryptography today is to reduce the key size and enhance the security level of the McEliece cryptosystem, which is a promising cryptographic scheme for the post-quantum era [4]. Error-correcting codes used in code-based cryptographic protocols must be decoded with efficient algorithms. The family of algebraic geometry (AG) codes and their subcodes and subfield subcodes constitute a rich class of such codes. These include the generalized Reed–Solomon, alternant, binary Goppa, and BCH codes. For a survey on decoding AG codes, see the research by Høholdt et al. [5].

Couvreur et al. [6–8] provided polynomial-time attacks against the McEliece cryptosystem that employs AG codes or their subcodes. In general, evaluation codes do not operate like random codes. This enables a wide variety of attacks against the McEliece cryptosystem based on AG codes. The technique described by Couvreur et al. [7,8] is inspired by the so-called filtration attacks that rely on computing the dimension of the Schur product that makes AG codes distinguishable from random ones. This observation was used by Wieschebrink [9] to provide an attack against the McEliece scheme based on subcodes of generalized Reed–Solomon codes [10]. Numerous attacks have employed a combination of powerful techniques, such as the filtration method, an error-correcting pair (ECP), or an error-correcting array (ECA), leading to a key recovery attack or a blind reconstruction of a decoding algorithm [7,8,11]. These vulnerabilities are based on the operation of the Schur product and a thorough examination of the dimensions of the Schur products for specific subcodes.

1.2 Key generation of code-based cryptosystems

The key generation process of the code-based scheme starts with a public code $C_0$ and a decoding algorithm $\Delta_0$ which can efficiently correct a certain number of errors. Then, a random seed $\sigma$ and a procedure $\Pi$ are used to construct a code $C$ with a decoding algorithm $\Delta$.

Roughly speaking, the code $C = \Pi(C_0, \sigma)$ represents the public key, while the decoding algorithm $\Delta = \Pi(\Delta_0, \sigma)$ represents the private key. The class of random alternant codes, where the starting code $C_0$ is the full support Reed–Solomon code of dimension $k$ over the field of order $q^m$, serves as an illustration. The random seed consists of a pair of vectors of length $n$ over $\mathbb{F}_{q^m}$: the multiplier $a = (a_1, \ldots, a_n)$ with $a_i \ne 0$, and the support $x = (x_1, \ldots, x_n)$ with $x_i \ne x_j$ for $i \ne j$. The process $\Pi$ has two main steps: first, compute the generalized Reed–Solomon code $C_1 = \mathrm{GRS}_k(x, a)$; then compute the subfield subcode $A_k(x, a) = C_1^\perp \cap \mathbb{F}_q^n$ of the dual of $C_1$. By Delsarte’s theorem, the second step is equivalent to taking the dual of the trace code: $A_k(x, a) = \mathrm{Tr}(C_1)^\perp$. (For more precise definitions and references, see Section 2.)

While binary Goppa codes form a subclass of alternant codes, randomness for binary Goppa codes operates differently. One starts with the full support Reed–Solomon code $C_0$, where $q = 2$ and $k = 2t$. The seed consists of the support $x$ and a monic irreducible polynomial $g(X)$ of degree $t$ over $\mathbb{F}_{q^m}$. The multiplier $a$ is defined by $a_i = 1/g(x_i)$, and the result is the alternant code $A_k(x, a)$. In both cases, the scheme’s cryptographic strength depends on taking the subfield subcode or, equivalently, taking the trace code. Known mathematical techniques have so far failed to capture the essence of these two operations. In particular, it is difficult to determine the true dimension of subfield subcodes and trace codes in general.

1.3 Random trace codes and their dimension

Subfield subcodes and trace codes are linked by duality. This study deals with the dimension problem of trace codes. Let $q$ be a prime power, and let $m$, $k$, $n$ be positive integers. We extend the trace map $\mathrm{Tr} : \mathbb{F}_{q^m} \to \mathbb{F}_q$ to vectors and matrices. For a linear subspace $C \le \mathbb{F}_{q^m}^n$, we write $\mathrm{Tr}(C) = \{ \mathrm{Tr}(x) \mid x \in C \}$. For a linear code $C \le \mathbb{F}_{q^m}^n$ of dimension $k$, we have the obvious upper bound $\dim(\mathrm{Tr}(C)) \le mk$, and we say that $C$ has maximum trace dimension if equality holds. Assume that the $\mathbb{F}_q$-linear code $C = \Pi(C_0, \sigma)$ is constructed using an $\mathbb{F}_{q^m}$-linear code $C_0$ and a random seed $\sigma$. Then, we may inquire about the probability

$$\mathrm{Prob}\bigl( C = \Pi(C_0, \sigma) \text{ has maximum trace dimension} \bigr),$$

a value which depends solely on $C_0$. This probability has already been estimated using numerical experimentation for binary Goppa codes of the classic McEliece scheme (see Sections 2.2.2 and 4.2 of [12]), and for random alternant codes [13].

The focus on this probability is mainly theoretical; however, bounds on the proportion of random alternant codes with maximum trace dimension are beneficial in understanding the complexity of the algorithms used in the key generation process of code-based cryptography, as well as the size of public keys.

In this study, we prove a lower bound for the probability of maximum trace dimension in the probability model of random multipliers.

Theorem 1

Let $C$ be an $[n, k, d]_{q^m}$-code and let $h = n + 1 - k - d$ be its Singleton defect. Let $P_C$ denote the proportion of multiplier vectors $a = (a_1, \ldots, a_n) \in (\mathbb{F}_{q^m}^*)^n$ such that the linear code

$$C_a = \{ (a_1 x_1, \ldots, a_n x_n) \mid x \in C \}$$

has maximum trace dimension. Then,

$$P_C \ge 1 - \frac{1 - q^{-m(h+k)}}{(q-1)\, q^{\,n - m(h+k)}}. \tag{1}$$

In particular, if $n \ge m(k+h)$, or equivalently $d \ge n(1 - 1/m) + 1$, then $P_C > 0$.

Our proof uses double counting methods that involve the weight distribution of the dual code $C^\perp$. We apply recent results of Meneghetti et al. [14] that relate the weight distribution to numerical properties of the code that can be computed if the Singleton defect is small. For our purposes, the most important property is the number of $k \times v$ submatrices of rank $r$ of the generator matrix.

Whenever $n \ge m(h+k)$, Theorem 1 implies $P_C \ge 1/2$, except in the case $q = 2$ and $n = m(h+k)$. This means that the Monte Carlo method of generating a random code $C_a$ of maximum trace dimension is very effective. For $q = 2$ and $n = m(h+k)$, further research is necessary.
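As a quick numerical illustration of the bound (1), the following Python sketch evaluates its right-hand side exactly with rational arithmetic. The helper name `theorem1_lower_bound` and the sample parameters are illustrative assumptions and do not come from the article.

```python
from fractions import Fraction


def theorem1_lower_bound(q, m, n, k, h):
    """Right-hand side of (1) as an exact fraction (hypothetical helper)."""
    t = m * (h + k)
    numerator = 1 - Fraction(1, q**t)
    denominator = (q - 1) * Fraction(q) ** (n - t)
    return 1 - numerator / denominator


# Extremal case n = m(h + k) with q = 2: the bound is positive but small.
print(theorem1_lower_bound(q=2, m=2, n=4, k=2, h=0))   # 1/16
# One extra coordinate, or q = 3, already pushes the bound above 1/2.
print(theorem1_lower_bound(q=2, m=2, n=5, k=2, h=0))   # 17/32
print(theorem1_lower_bound(q=3, m=2, n=4, k=2, h=0))   # 41/81
```

The three sample values match the remark above: only the combination $q = 2$, $n = m(h+k)$ falls below $1/2$.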

If $n < mk$, then clearly $\dim(\mathrm{Tr}(C_a)) \le n < m \dim(C) = mk$, so $C_a$ cannot have maximum trace dimension for any multiplier vector $a$. Moreover, if $C$ is a maximum distance separable (MDS) code of length $n - h$ extended with zeros in the last $h$ coordinates, then it is easy to see that $\dim(\mathrm{Tr}(C_a)) \le n - h$. Thus, one might ask for the proportion of multiplier vectors for which $\dim(\mathrm{Tr}(C_a))$ is close to the largest possible value $n$.

Theorem 2

Let $C$ be an $[n, k, d]_{q^m}$-code and let $h = n + 1 - k - d$ be its Singleton defect. Let $P'_C$ denote the proportion of multiplier vectors $a = (a_1, \ldots, a_n) \in (\mathbb{F}_{q^m}^*)^n$ such that $\dim(\mathrm{Tr}(C_a)) \ge n - mh$. Then,

$$P'_C \ge \frac{q^{mh+1} - q^{mh} - q^{n-mk} + q^{-mk}}{q^{mh+1} - 1}. \tag{2}$$

In particular, if $n \le m(k+h)$, or equivalently $d \le n(1 - 1/m) + 1$, then $P'_C > 0$.

If $h = 0$ (thus $C$ is MDS) and $n \le mk$, then the formula in the above theorem becomes simpler and resembles the one in Theorem 1:

$$P'_C \ge 1 - \frac{1 - q^{-n}}{(q-1)\, q^{\,mk-n}}. \tag{3}$$
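The reduction from (2) to (3) can be checked mechanically. The sketch below, with hypothetical helper names `theorem2_lower_bound` and `mds_form`, evaluates both expressions with exact fractions and compares them for $h = 0$ on a few small, arbitrarily chosen parameter sets.

```python
from fractions import Fraction


def theorem2_lower_bound(q, m, n, k, h):
    """Right-hand side of (2) as an exact fraction (hypothetical helper)."""
    Q = Fraction(q)
    numerator = Q ** (m * h + 1) - Q ** (m * h) - Q ** (n - m * k) + Q ** (-m * k)
    return numerator / (Q ** (m * h + 1) - 1)


def mds_form(q, m, n, k):
    """Right-hand side of (3), the special case h = 0 (hypothetical helper)."""
    Q = Fraction(q)
    return 1 - (1 - Q ** (-n)) / ((q - 1) * Q ** (m * k - n))


# Sample value with q = 2, m = 3, k = 2, n = 5 (so n <= mk).
print(theorem2_lower_bound(2, 3, 5, 2, 0))  # 33/64
print(mds_form(2, 3, 5, 2))                 # 33/64
```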

1.4 Maximum trace dimension probabilities of AG codes

AG codes are linear error-correcting codes constructed from algebraic curves over finite fields, generalizing the Reed–Solomon code concept. They are defined by evaluating functions or by using residues of differentials. Their parameters can be derived from well-known theorems in AG. Our notation and terminology on algebraic plane curves over finite fields, their function fields, divisors, and Riemann–Roch spaces are conventional (see, for example, [15]).

Let $X$ be a smooth algebraic curve over the finite field $\mathbb{F}_{q^m}$. Let $P_1, P_2, \ldots, P_n$ be pairwise distinct places of $X$, and let $D$ be the divisor $D = P_1 + \cdots + P_n$. Let $G$ be another divisor with support disjoint from $D$. The Riemann–Roch theorem enables us to estimate the dimension, the minimum distance and the Singleton defect of AG codes. These, together with Theorem 1, imply a lower bound for the probability of maximum trace dimension of AG codes.

Theorem 3

Let $C = C_L(D, G)$ be a functional AG code of length $n = \deg(D)$ over the finite field $\mathbb{F}_{q^m}$, $m > 1$. If $\deg(G) \le n/m - 1$, then

$$P_C \ge 1 - \frac{1 - q^{-m(\deg(G)+1)}}{(q-1)\, q^{\,n - m(\deg(G)+1)}}. \tag{4}$$

1.5 Rank properties of random matrices in other probabilistic models

The rank properties of random matrices over finite fields have been extensively studied as a problem in combinatorial graph theory and other contexts, including coding theory and code-based post-quantum cryptography. For the probabilistic paradigm, there are a variety of alternatives [16–20]. One possibility is to choose each entry of the matrix independently and uniformly at random from the field. This can be extended to non-uniform distributions, which may or may not depend on the matrix entry’s position. Studholme and Blake [20] studied windowed random matrices, where the nonzero elements of each column are restricted to fall within a window of length w , beginning at a randomly chosen row. Salmond et al. [18] proved that the probability that a random matrix has full rank cannot increase if we fix any number of additional elements to be identically zero.

Let $A$ be an $n \times n$ matrix over the finite field $\mathbb{F}_q$ whose entries are chosen uniformly at random. As $n \to \infty$, the probability that $A$ has rank $n$ converges very fast to the value

$$S(q) = \prod_{i=1}^{\infty} \left( 1 - \frac{1}{q^i} \right). \tag{5}$$

The value $S(q)$, which is independent of $n$, is also called the $q$-Pochhammer symbol $(1/q; 1/q)_\infty$ [21]. For $q = 2$, a good estimate for $S(2)$ is 0.2888. Let $V$ be an $\mathbb{F}_q$-space of dimension $n$, and take $n$ nonzero vectors uniformly at random from $V \setminus \{0\}$. The probability that the vectors are linearly independent also converges to $S(q)$ very fast as $n \to \infty$.
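The limit in (5) is easy to approximate numerically. The following sketch truncates the infinite product after a fixed number of factors; the function name `pochhammer_limit` and the cutoff of 200 factors are assumptions made for illustration only.

```python
def pochhammer_limit(q, terms=200):
    """Approximate S(q) = prod_{i>=1} (1 - q^{-i}) by a truncated product."""
    product = 1.0
    for i in range(1, terms + 1):
        product *= 1.0 - float(q) ** (-i)
    return product


print(round(pochhammer_limit(2), 4))  # 0.2888
print(round(pochhammer_limit(3), 4))
```

Since the factors approach 1 geometrically, a few dozen factors already determine the value to double precision.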

We performed numerical experiments for Reed–Solomon codes $C = \mathrm{RS}_k(x)$ over $\mathbb{F}_{q^m}$, where $k$, $m$ are positive integers, $q = 2$ or $q = 3$, and $x_1, \ldots, x_{km}$ are random distinct elements of $\mathbb{F}_{q^m}$. Therefore, $C$ has length $n = km$ and Singleton defect $h = 0$. We observed that the probability that $C_a$ has maximum trace dimension for a random multiplier vector $a$ is close to the value $S(q)$.

1.6 Outline of the article

Notation and classical prerequisites on linear codes are given in Section 2. Section 3 collects basic properties and examples of codes which have maximum trace dimension. In Section 4, we deal with the dimension problem of random alternant codes. Sections 5 and 6 contain detailed proofs of the main theorems. The basic concepts of AG codes are also presented in Section 6.

2 Prerequisites from coding theory

Let $q$ be a prime power and let $m$, $n$, $k$ be positive integers such that $mk \le n \le q^m$. Let $x_1, \ldots, x_n$ be distinct elements of $\mathbb{F}_{q^m}$. The Reed–Solomon code $\mathrm{RS}_k(x)$ is defined by the generator matrix

$$G = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ \vdots & \vdots & & \vdots \\ x_1^{k-1} & x_2^{k-1} & \cdots & x_n^{k-1} \end{pmatrix}. \tag{6}$$

The vector $x$ is called the support of the Reed–Solomon code. $\mathrm{RS}_k(x)$ has dimension $k$ and minimum distance $d = n - k + 1$. It is an MDS code with Singleton defect $h = 0$. Let $a_1, \ldots, a_n$ be nonzero elements of $\mathbb{F}_{q^m}$. The generalized Reed–Solomon code $\mathrm{GRS}_k(x, a)$ has generator matrix

$$G' = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ a_1 x_1 & a_2 x_2 & \cdots & a_n x_n \\ \vdots & \vdots & & \vdots \\ a_1 x_1^{k-1} & a_2 x_2^{k-1} & \cdots & a_n x_n^{k-1} \end{pmatrix}.$$

Clearly, $\mathrm{GRS}_k(x, a)$ and $\mathrm{RS}_k(x)$ have the same parameters. In particular, generalized Reed–Solomon codes are MDS. If $n = q^m$, then $\{x_1, \ldots, x_n\} = \mathbb{F}_{q^m}$ and the codes are said to have full support. The dual code of $\mathrm{GRS}_k(x, a)$ is again a generalized Reed–Solomon code $\mathrm{GRS}_{n-k}(x, b)$, with the same support $x$. The Berlekamp–Massey algorithm provides an efficient decoding algorithm for Reed–Solomon codes, which can correct up to $\lfloor (d-1)/2 \rfloor = \lfloor (n-k)/2 \rfloor$ errors. If the multiplier vector is given, then this algorithm can also be used to decode generalized Reed–Solomon codes.

Let $C$ be a linear code of length $n$, dimension $k$, and minimum distance $d$, defined over the finite field $\mathbb{F}_{q^m}$. The subfield subcode or restricted code of $C$ is

$$C|_{\mathbb{F}_q} = C \cap \mathbb{F}_q^n.$$

We extend the trace map $\mathrm{Tr} : \mathbb{F}_{q^m} \to \mathbb{F}_q$ to vectors and matrices entry-wise. We define the trace code of the linear code $C \le \mathbb{F}_{q^m}^n$ by

$$\mathrm{Tr}(C) = \{ \mathrm{Tr}(x) \mid x \in C \}.$$

Clearly, $\mathrm{Tr}(C)$ is an $\mathbb{F}_q$-linear code of length $n$. Let $x_1, \ldots, x_k$ be a basis of $C$, and let $\beta_1, \ldots, \beta_m$ be a basis of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$. Then, the vectors $\mathrm{Tr}(\beta_i x_j)$, $1 \le i \le m$, $1 \le j \le k$, span the trace code $\mathrm{Tr}(C)$. This implies the obvious upper bound $\dim(\mathrm{Tr}(C)) \le km$ for the dimension of the trace code. We say that $C$ has maximum trace dimension if

$$\dim_{\mathbb{F}_q}(\mathrm{Tr}(C)) = m \dim_{\mathbb{F}_{q^m}}(C).$$

According to Delsarte’s theorem [22],

$$(\mathrm{Tr}(C))^\perp = (C^\perp)|_{\mathbb{F}_q},$$

which shows that subfield subcodes and trace codes are basically dual objects. This yields the obvious lower bound

$$\dim(C|_{\mathbb{F}_q}) \ge n - m(n-k)$$

for the dimension of the subfield subcode. The minimum distance of $C|_{\mathbb{F}_q}$ is at least the minimum distance of $C$. Moreover, subfield subcodes inherit the decoding algorithms of their parent code.

An alternant code is defined as the subfield subcode of a generalized Reed–Solomon code

$$A_k(x, a) = (\mathrm{GRS}_k(x, a)^\perp)|_{\mathbb{F}_q},$$

or equivalently, as the dual code of the trace code of a generalized Reed–Solomon code

$$A_k(x, a) = \mathrm{Tr}(\mathrm{GRS}_k(x, a))^\perp.$$

The integer $k$ is referred to as the degree of the alternant code, and $m$ as its extension degree. The vector $x$ is the support, and the vector $a$ is the multiplier of the alternant code. In the sequel, even without explicitly saying it, we assume that the entries of the support vector are distinct, and the entries of the multiplier vector are different from zero.

The obvious lower bound for the dimension of the alternant code is

$$\dim(A_k(x, a)) \ge n - mk.$$

Given the support and the multiplier, the Berlekamp–Massey algorithm can correct up to $k/2$ errors for the alternant code $A_k(x, a)$.

Recall that the Schur product of the vectors $a = (a_1, \ldots, a_n)$, $b = (b_1, \ldots, b_n)$ is defined by

$$a \star b = (a_1 b_1, \ldots, a_n b_n).$$

3 The maximum trace dimension property

In this section, we prove a collection of properties of codes having the maximum trace dimension. At the end of the section, we present a class of examples which shows that Theorem 1 is close to being sharp asymptotically.

In the sequel, C denotes an F q m -linear code of length n , dimension k , and minimum distance d .

Definition 4

Define the support of $C$ as

$$\mathrm{supp}(C) = \{ i \in \{1, 2, \ldots, n\} \mid \exists x \in C : x_i \ne 0 \}. \tag{7}$$

For an integer $i$, define

$$d_i(C) = \min_{\substack{D \le C \\ \dim(D) = i}} |\mathrm{supp}(D)|. \tag{8}$$

Note that $d_1(C) = d$ is the minimum distance. Clearly, $\mathrm{supp}(C) = \mathrm{supp}(C_a)$ and $d_i(C) = d_i(C_a)$ for each multiplier vector $a$. Furthermore, $\dim(C) \le |\mathrm{supp}(C)|$.

The proofs of the following lemmas are straightforward consequences of the definitions.

Lemma 5

The following are equivalent:

(i) The code $C$ has maximum trace dimension.
(ii) All $\mathbb{F}_{q^m}$-linear subspaces of $C$ have maximum trace dimension.
(iii) For all $x \in C \setminus \{0\}$, $\mathrm{Tr}(x) \ne 0$.
(iv) $C \cap K = \{0\}$, where $K$ is the kernel of the trace map $\mathrm{Tr} : \mathbb{F}_{q^m}^n \to \mathbb{F}_q^n$.
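The third condition of Lemma 5 (every nonzero codeword has a nonzero trace vector) gives a simple exhaustive test for very small codes. The sketch below applies it to the full support Reed–Solomon code of dimension 2 over $\mathbb{F}_4$ and counts the multiplier vectors that yield maximum trace dimension. The choice of code, the integer encoding of $\mathbb{F}_4$, and all function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def gf4_mul(a, b):
    """Multiply in F_4 = F_2[w]/(w^2 + w + 1); elements are the ints 0..3."""
    result = 0
    for _ in range(2):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b100:          # reduce modulo w^2 + w + 1
            a ^= 0b111
    return result


def trace(a):
    """Tr: F_4 -> F_2, Tr(a) = a + a^2 (returns 0 or 1)."""
    return a ^ gf4_mul(a, a)


SUPPORT = [0, 1, 2, 3]                       # full support, n = 4
# All 16 codewords of RS_2(x): evaluations of c0 + c1 * X on the support.
CODE = [tuple(c0 ^ gf4_mul(c1, x) for x in SUPPORT)
        for c0 in range(4) for c1 in range(4)]


def has_max_trace_dim(a):
    """True if every nonzero word of C_a has a nonzero trace vector."""
    for word in CODE:
        if any(word):
            scaled = [gf4_mul(ai, ci) for ai, ci in zip(a, word)]
            if not any(trace(s) for s in scaled):
                return False
    return True


multipliers = list(product([1, 2, 3], repeat=4))
good = sum(has_max_trace_dim(a) for a in multipliers)
proportion = Fraction(good, len(multipliers))
# Right-hand side of (1) for q = 2, m = 2, k = 2, h = 0, n = 4.
bound = 1 - (1 - Fraction(1, 2**4))
print(good, proportion, bound)
```

For this code, 42 of the 81 multiplier vectors pass the test, so the exact proportion is 14/27, well above the lower bound 1/16 from Theorem 1.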

Lemma 6

Assume that for some multiplier vector $a$, $C_a$ has maximum trace dimension. Then, we have $d_i(C) \ge im$ for all $1 \le i \le k$.

Proof

If $D \le C$ with $\dim(D) = i$ is such that $|\mathrm{supp}(D)| < im$, then

$$\mathrm{supp}(\mathrm{Tr}(D_a)) \subseteq \mathrm{supp}(D_a) = \mathrm{supp}(D)$$

and

$$\dim(\mathrm{Tr}(D_a)) \le |\mathrm{supp}(\mathrm{Tr}(D_a))| \le |\mathrm{supp}(D)| < im = m \dim(D) = m \dim(D_a).$$

Therefore, $D_a$ does not have maximum trace dimension and, by Lemma 5, neither does $C_a$.□

We conjecture that the converse of Lemma 6 holds as well.

As the following examples show, the proportion of multiplier vectors producing a maximum trace dimension code is related to the probability that a random matrix has full rank. Let $A$ be an $n \times n$ matrix whose entries are chosen from $\mathbb{F}_q$ uniformly at random. The probability that $A$ has maximum rank $n$ is

$$S_1(n, q) = \prod_{i=1}^{n} \left( 1 - \frac{1}{q^i} \right). \tag{9}$$

As $n \to \infty$, $S_1(n, q)$ converges very fast to the value

$$S(q) = \prod_{i=1}^{\infty} \left( 1 - \frac{1}{q^i} \right). \tag{10}$$

Let $V$ be an $\mathbb{F}_q$-space of dimension $n$, and take $n$ nonzero vectors uniformly at random from $V \setminus \{0\}$. The probability that the vectors are linearly independent is

$$S_2(n, q) = \prod_{j=0}^{n-1} \frac{q^n - q^j}{q^n - 1} = \prod_{i=1}^{n} \left( 1 - \frac{1}{q^i} \right) \left( 1 + \frac{1}{q^n - 1} \right)^{n}. \tag{11}$$

As the last factor converges to 1 very fast, $S_2(n, q) \to S(q)$ very fast (Figures 1 and 2). In fact, if $n > 20$, then $S(q)$ is a good practical approximation for $S_1(n, q)$ and $S_2(n, q)$.
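The two expressions for $S_2(n, q)$ in (11) can be compared with exact arithmetic. The following sketch uses hypothetical helper names `s1`, `s2`, and `s2_via_s1`; it is meant only as a sanity check of the identity on small inputs.

```python
from fractions import Fraction


def s1(n, q):
    """S_1(n, q) from (9): full-rank probability of a random n x n matrix."""
    value = Fraction(1)
    for i in range(1, n + 1):
        value *= 1 - Fraction(1, q**i)
    return value


def s2(n, q):
    """S_2(n, q), first form in (11): n random nonzero vectors are independent."""
    value = Fraction(1)
    for j in range(n):
        value *= Fraction(q**n - q**j, q**n - 1)
    return value


def s2_via_s1(n, q):
    """S_2(n, q), second form in (11)."""
    return s1(n, q) * (1 + Fraction(1, q**n - 1)) ** n


print(s1(3, 2), s2(3, 2), s2_via_s1(3, 2))  # 21/64 24/49 24/49
print(float(s1(20, 2)), float(s2(20, 2)))
```

For $n = 20$ and $q = 2$ both values printed on the last line lie close to each other, in line with the remark about $n > 20$.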

Figure 1: $q = 2$.

Figure 2: $q = 3$.

Lemma 7

Let $C$ be the $m$-fold repetition code over $\mathbb{F}_{q^m}$. The probability that $C_a$ has maximum trace dimension for a random multiplier vector $a$ is $S_2(m, q)$. In practice, if $m \ge 20$, then $S(q)$ is a good approximation for this probability.
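Lemma 7 can be confirmed by exhaustive search in a tiny case. The sketch below takes $q = 2$ and $m = 3$, models $\mathbb{F}_8$ as $\mathbb{F}_2[x]/(x^3 + x + 1)$ with elements encoded as the integers 0 to 7, and tests every multiplier vector of the 3-fold repetition code. The field representation and the function names are assumptions chosen for this illustration.

```python
from fractions import Fraction
from itertools import product

M = 3
MODULUS = 0b1011  # x^3 + x + 1, irreducible over F_2


def gf_mul(a, b):
    """Multiply two elements of F_8 encoded as ints 0..7."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):       # reduce modulo x^3 + x + 1
            a ^= MODULUS
    return result


def trace(a):
    """Tr: F_8 -> F_2, Tr(a) = a + a^2 + a^4 (returns 0 or 1)."""
    total, power = 0, a
    for _ in range(M):
        total ^= power
        power = gf_mul(power, power)
    return total


def has_max_trace_dim(a):
    """C_a = {lam * a}: every nonzero lam must give a nonzero trace vector."""
    return all(any(trace(gf_mul(lam, ai)) for ai in a)
               for lam in range(1, 2**M))


nonzero = range(1, 2**M)
count = sum(has_max_trace_dim(a) for a in product(nonzero, repeat=M))
proportion = Fraction(count, (2**M - 1) ** M)

# S_2(3, 2) computed directly from its product formula.
s2_value = Fraction(1)
for j in range(M):
    s2_value *= Fraction(2**M - 2**j, 2**M - 1)

print(count, proportion, s2_value)  # 168 24/49 24/49
```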

Let $C_i$ be linear $[n_i, k_i]_{q^m}$-codes, $i = 1, 2$. Their sum $C_1 + C_2$ is a linear $[n_1 + n_2, k_1 + k_2]_{q^m}$-code whose codewords are $(x_1, x_2)$ with $x_i \in C_i$. The minimum distance of the sum is $d(C_1 + C_2) = \min(d(C_1), d(C_2))$.

Lemma 8

Let $C$, $C'$ be $\mathbb{F}_{q^m}$-linear codes, and $D = C + C'$ their sum. Then, $P_D = P_C P_{C'}$, where $P_C$ is as defined in Theorem 1.

Let $C$ be the $k$-fold sum of the $m$-fold repetition code. Clearly, $C$ has length $n = mk$, dimension $k$, and minimum distance $d = m$. The last two lemmas imply that the proportion $P_C$ of multiplier vectors with maximum trace dimension is approximately $P_C \approx S(q)^k$, which tends to zero as $k \to \infty$. In particular, we cannot expect $P_C$ to be close to 1 just because $k$ and $m$ are large. However, $P_C > 0$, so there is a multiplier $a$ such that $C_a$ has maximum trace dimension. On the other hand, $d_i(C) = im$ for all $1 \le i \le k$, showing that Lemma 6 is sharp.

Clearly, if $n < mk$, then $P_C = 0$. In Theorem 1, we see that $n \ge m(h+k)$ implies $P_C > 0$. The question whether $P_C$ is zero or not is open for the interval $[mk, m(h+k) - 1]$. The following class of examples has Singleton defect $h \approx \log_q(k)$, hence the interval is small. Still, the condition $n = mk$ is not enough to ensure $P_C > 0$. In other words, Theorem 1 is close to being sharp asymptotically.

Proposition 9

For every prime power $q$ and all integers $m > 2$, $2 \le k \le q^m/m$, there exists an $\mathbb{F}_{q^m}$-linear code $C' = C'(q, m, k)$ of length $n = mk$, dimension $k$, and Singleton defect $h = m$, such that $P_{C'} = 0$.

Proof

Let $n' = m(k-1) - 1$ and let $x_1, \ldots, x_n$ be distinct elements of $\mathbb{F}_{q^m}$ such that $x_{n'+1}, \ldots, x_n \ne 0$. Let $C' = C'(q, m, k)$ be the code with generator matrix

$$G' = \begin{pmatrix} 1 & 1 & \cdots & 1 & 0 & \cdots & 0 \\ x_1 & x_2 & \cdots & x_{n'} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ x_1^{k-2} & x_2^{k-2} & \cdots & x_{n'}^{k-2} & 0 & \cdots & 0 \\ x_1^{k-1} & x_2^{k-1} & \cdots & x_{n'}^{k-1} & x_{n'+1}^{k-1} & \cdots & x_n^{k-1} \end{pmatrix}.$$

Let $D'$ be the subcode generated by the first $k - 1$ rows. As $k - 1 \le n'$, we have $\dim(D') = k - 1$. Moreover, $D'$ has support $\{1, 2, \ldots, n'\}$, hence $|\mathrm{supp}(D')| = n' = m(k-1) - 1 < m \dim(D')$. Lemma 6 implies that $P_{C'} = 0$.

Now we compute the minimum distance of $C'$. Take any nonzero linear combination $c$ of the rows of $G'$. Write $c' = (c_1, c_2, \ldots, c_{n'})$ and $x' = (x_1, x_2, \ldots, x_{n'})$. If the last row has zero coefficient, then the last $m + 1$ coordinates are 0 and $c' \in \mathrm{RS}_{k-1}(x')$. So

$$\mathrm{wt}(c) \ge n' - (k-1) + 1 = (m-1)(k-1),$$

and equality occurs for some $c$. If the last row has nonzero coefficient, then the last $m + 1$ coordinates of $c$ are nonzero and $c' \in \mathrm{RS}_k(x')$. So

$$\mathrm{wt}(c) \ge n' - k + 1 + m + 1 > (m-1)(k-1).$$

Thus, the minimum distance of $C'(q, m, k)$ is indeed $d = (m-1)(k-1)$, and $h = n + 1 - k - d = m$.□

4 The dimension of random alternant codes

In numerical experiments, one observes that the dimension of random alternant codes typically attains the obvious lower bound [13]. In this short section, we derive a proof for this observation from Theorem 1. We show that if the length of the random alternant code exceeds m k , then the dimension is n − m k with high probability. In particular, this is the case for most random alternant codes of full support.

Definition 10

Given the field of definition F q , the degree k , and the extension degree m , the random alternant code is a code A k ( x , a ) , where the support x and the multiplier a are chosen uniformly at random.

Proposition 11

Let $q$ be a prime power and $m$, $n$, $k$ be positive integers such that $mk \le n \le q^m$. The random alternant code of length $n$, degree $k$, extension degree $m$ over $\mathbb{F}_q$ has dimension $n - mk$ with probability at least

$$1 - \frac{1 - q^{-mk}}{(q-1)\, q^{\,n-mk}}.$$

Proof

The dual of the alternant code is $\mathrm{Tr}(\mathrm{GRS}_k(x, a))$. Since $\mathrm{GRS}_k(x, a)$ is MDS of dimension $k$, Theorem 1 with $h = 0$ implies the proposition.□

5 Proof of Theorems 1 and 2

In this section, we use the notation of Theorem 1. We describe the average cardinality of $\mathrm{Tr}(C_a)^\perp$, $a \in (\mathbb{F}_{q^m}^*)^n$, with the help of the weight distribution of the dual code $C^\perp$. Let us introduce the following notation:

Definition 12

Let $\mathrm{wt} : \mathbb{F}_{q^m}^n \to \mathbb{N}$ denote the Hamming weight and let

$$B_w = |\{ c \in C^\perp \mid \mathrm{wt}(c) = w \}| \tag{12}$$

for $0 \le w \le n$ denote the weight distribution of $C^\perp$. Then, let

$$\lambda(C) = \sum_{w=0}^{n} B_w \left( \frac{q-1}{q^m-1} \right)^{w}. \tag{13}$$

For $0 \le r \le v \le n$, let

$$N_G(v, r) = |\{ k \times v \text{ submatrices of } G \text{ with rank } r \}|, \tag{14}$$

where $G \in \mathbb{F}_{q^m}^{k \times n}$ is a generator matrix of $C$.

Proposition 13

We have the following average form:

$$\lambda(C) = \frac{1}{|(\mathbb{F}_{q^m}^*)^n|} \sum_{a \in (\mathbb{F}_{q^m}^*)^n} q^{\,n - \dim(\mathrm{Tr}(C_a))}. \tag{15}$$

Proof

For $a \in (\mathbb{F}_{q^m}^*)^n$, we write $a^{-1} = (a_j^{-1})_{1 \le j \le n}$. We double-count the set

$$H = \{ (a, c) \mid a^{-1} \star c \in \mathbb{F}_q^n,\ a \in (\mathbb{F}_{q^m}^*)^n,\ c \in C^\perp \}. \tag{16}$$

For any fixed $a$, $(a, c) \in H$ if and only if $a^{-1} \star c \in (C^\perp)_{a^{-1}} \cap \mathbb{F}_q^n$. By Delsarte’s theorem [22, Theorem 2], we have

$$(C^\perp)_{a^{-1}} \cap \mathbb{F}_q^n = (C_a)^\perp \cap \mathbb{F}_q^n = (\mathrm{Tr}(C_a))^\perp. \tag{17}$$

Hence, $|(C^\perp)_{a^{-1}} \cap \mathbb{F}_q^n| = q^{\,n - \dim(\mathrm{Tr}(C_a))}$. This proves

$$|H| = \sum_{a \in (\mathbb{F}_{q^m}^*)^n} q^{\,n - \dim(\mathrm{Tr}(C_a))}. \tag{18}$$

Let us now fix $c \in C^\perp$. For each $j$, we have

$$\{ a_j \in \mathbb{F}_{q^m}^* \mid a_j^{-1} c_j \in \mathbb{F}_q \} = \begin{cases} \mathbb{F}_{q^m}^*, & \text{if } c_j = 0; \\ c_j \mathbb{F}_q^*, & \text{if } c_j \ne 0. \end{cases} \tag{19}$$

Thus,

$$|\{ a \in (\mathbb{F}_{q^m}^*)^n \mid a^{-1} \star c \in \mathbb{F}_q^n \}| = (q-1)^{\mathrm{wt}(c)} (q^m-1)^{\,n - \mathrm{wt}(c)}. \tag{20}$$

Summing over all $c \in C^\perp$, we obtain $|H| = (q^m-1)^n \lambda(C)$.□
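The identity (15) can be verified by brute force on a very small example. The sketch below takes the full support Reed–Solomon code of dimension 2 over $\mathbb{F}_4$ (so $q = 2$, $m = 2$, $n = 4$), computes $\lambda(C)$ once from the weight distribution of the dual as in (13) and once as the average in (15). The example code, the integer encoding of $\mathbb{F}_4$, and the helper names are assumptions for illustration. The size of $\mathrm{Tr}(C_a)$ is obtained by listing all trace vectors, which avoids a rank computation.

```python
from fractions import Fraction
from itertools import product

Q, M, N = 2, 2, 4


def gf4_mul(a, b):
    """Multiply in F_4 = F_2[w]/(w^2 + w + 1); elements are the ints 0..3."""
    result = 0
    for _ in range(2):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b100:
            a ^= 0b111
    return result


def trace(a):
    """Tr: F_4 -> F_2, Tr(a) = a + a^2."""
    return a ^ gf4_mul(a, a)


def dot(u, v):
    """Standard bilinear form on F_4^n (addition in F_4 is XOR)."""
    total = 0
    for ui, vi in zip(u, v):
        total ^= gf4_mul(ui, vi)
    return total


SUPPORT = [0, 1, 2, 3]
CODE = [tuple(c0 ^ gf4_mul(c1, x) for x in SUPPORT)
        for c0 in range(4) for c1 in range(4)]
DUAL = [c for c in product(range(4), repeat=N)
        if all(dot(c, word) == 0 for word in CODE)]

# lambda(C) via the weight distribution of the dual code, formula (13).
lam_weights = sum(Fraction(Q - 1, Q**M - 1) ** sum(1 for ci in c if ci)
                  for c in DUAL)

# lambda(C) via the average over all multiplier vectors, formula (15).
multipliers = list(product([1, 2, 3], repeat=N))
total = Fraction(0)
for a in multipliers:
    traces = {tuple(trace(gf4_mul(ai, ci)) for ai, ci in zip(a, word))
              for word in CODE}
    total += Fraction(Q**N, len(traces))   # q^(n - dim Tr(C_a))
lam_average = total / len(multipliers)

print(lam_weights, lam_average)  # 40/27 40/27
```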

As $\dim(\mathrm{Tr}(C_a)) \le km$, each summand on the right-hand side of (15) is at least $q^{\,n-km}$. This gives a lower bound

$$\lambda(C) \ge q^{\,n-km}. \tag{21}$$

The upper bounds of $\lambda(C)$ can be used to find lower bounds on the proportion $P_C$ of multiplier vectors which produce maximum trace dimension codes and on the proportion $P'_C$ of multiplier vectors which produce trace codes with dimension at least $n - mh$.

Proposition 14

(i) Assume $\lambda(C) \le q^{\,n-km} + E$, where $E$ is nonnegative. Then,

$$P_C \ge 1 - \frac{E}{(q-1)\, q^{\,n-km}}. \tag{22}$$

(ii) Assume $\lambda(C) \le q^{mh+1} - E'$. Then,

$$P'_C \ge \frac{E'}{q^{mh+1} - 1}. \tag{23}$$

Proof

If $C_a$ does not have maximum trace dimension, then the corresponding summand in (15) is at least $q^{\,n-km+1}$. Therefore, $P_C\, q^{\,n-km} + (1 - P_C)\, q^{\,n-km+1} \le \lambda(C)$. The first claim follows from a straightforward computation.

In a similar manner, if $\dim(\mathrm{Tr}(C_a)) < n - mh$, then the corresponding summand in (15) is at least $q^{mh+1}$; otherwise, it is at least 1. Therefore, $P'_C + (1 - P'_C)\, q^{mh+1} \le \lambda(C)$, hence the second claim.□

Proposition 15

$$\lambda(C) = \left( \frac{q^m - q}{q^m - 1} \right)^{n} \sum_{v=0}^{n} \left( \frac{q-1}{q^m - q} \right)^{v} \sum_{r=0}^{v} N_G(v, r)\, q^{m(v-r)}, \tag{24}$$

where $N_G(v, r)$ is as defined in (14).

Proof

Applying Proposition 3 of [14] for $C^\perp$ over $\mathbb{F}_{q^m}$, we obtain

$$\sum_{s=0}^{v} \binom{n-s}{v-s} B_s = \sum_{r=0}^{v} N_G(v, r)\, q^{m(v-r)}. \tag{25}$$

Multiplying with $x^v$ and summing over $0 \le v \le n$, we obtain

$$\sum_{v=0}^{n} x^v \sum_{s=0}^{v} \binom{n-s}{v-s} B_s = \sum_{v=0}^{n} x^v \sum_{r=0}^{v} N_G(v, r)\, q^{m(v-r)}. \tag{26}$$

Changing the order of the summation and using the binomial theorem, the left-hand side is

$$\sum_{s=0}^{n} B_s x^s \sum_{v=s}^{n} \binom{n-s}{v-s} x^{v-s} = \sum_{s=0}^{n} B_s x^s (1+x)^{n-s}. \tag{27}$$

Let us put $x = \frac{q-1}{q^m - q}$, thus $1 + x = \frac{q^m - 1}{q^m - q}$. Then, by the definition,

$$\lambda(C) = \sum_{s=0}^{n} B_s \frac{(q-1)^s (q^m-1)^{n-s}}{(q^m-1)^n} = \left( \frac{q^m - q}{q^m - 1} \right)^{n} \sum_{s=0}^{n} B_s x^s (1+x)^{n-s}. \tag{28}$$

By (26) and (27), we have

$$\lambda(C) = \left( \frac{q^m - q}{q^m - 1} \right)^{n} \sum_{v=0}^{n} x^v \sum_{r=0}^{v} N_G(v, r)\, q^{m(v-r)}, \tag{29}$$

hence the proposition.□

Proof of Theorems 1 and 2

Applying Lemma 4 in [14] for $C^\perp$, every $k \times (k+h)$ submatrix of $G$ has rank $k$. It follows that the rank of every $k \times v$ submatrix equals $k$ if $v \ge k + h$ and is at least $v - h$ if $v < k + h$.

By using this observation, we can bound the inner sum on the right-hand side of the previous proposition:

For $v \ge k + h$, we have $N_G(v, r) = 0$ for $r < k$ and

$$\sum_{r=0}^{v} N_G(v, r)\, q^{m(v-r)} = \binom{n}{v} q^{m(v-k)}; \tag{30}$$

for $v < k + h$, we have $N_G(v, r) = 0$ for $r < v - h$ and

$$\sum_{r=0}^{v} N_G(v, r)\, q^{m(v-r)} \le \binom{n}{v} q^{mh}. \tag{31}$$

In view of (30), (31), and $x = \frac{q-1}{q^m - q}$, we obtain

$$\begin{aligned}
\lambda(C) &\le \left( \frac{q^m - q}{q^m - 1} \right)^{n} \left[ \sum_{v=0}^{k+h-1} \binom{n}{v} x^v q^{mh} + \sum_{v=k+h}^{n} \binom{n}{v} x^v q^{m(v-k)} \right] \\
&= \frac{1}{q^{mk}} \left( \frac{q^m - q}{q^m - 1} \right)^{n} \left[ \sum_{v=0}^{n} \binom{n}{v} (x q^m)^v + \sum_{v=0}^{k+h-1} \binom{n}{v} x^v \bigl( q^{m(h+k)} - q^{mv} \bigr) \right] \\
&\le \frac{1}{q^{mk}} \left( \frac{q^m - q}{q^m - 1} \right)^{n} \left[ (1 + x q^m)^n + \bigl( q^{m(h+k)} - 1 \bigr) \sum_{v=0}^{k+h} \binom{n}{v} x^v \right] \\
&\le \frac{1}{q^{mk}} \left( \frac{q^m - q}{q^m - 1} \right)^{n} \Bigl( (1 + x q^m)^n + \bigl( q^{m(h+k)} - 1 \bigr) (1 + x)^n \Bigr).
\end{aligned}$$

As $1 + x = \frac{q^m - 1}{q^m - q}$ and $1 + x q^m = q \cdot \frac{q^m - 1}{q^m - q}$, we obtain

$$\lambda(C) \le q^{\,n-mk} + \bigl( q^{mh} - q^{-mk} \bigr). \tag{32}$$

Set $E = q^{mh} - q^{-mk}$ and

$$E' = q^{mh+1} - \bigl( q^{\,n-mk} + q^{mh} - q^{-mk} \bigr).$$

By using Proposition 14, we obtain the lower bounds on $P_C$ and $P'_C$ as in the statements. If $n \le m(k+h)$, then $E' \ge (q-2)\, q^{mh} + q^{-mk} > 0$, thus $P'_C > 0$.□

6 Proof of Theorem 3

AG codes are linear error-correcting codes constructed from algebraic curves over finite fields, generalizing the Reed–Solomon code concept. They are defined by evaluating functions or by using residues of differentials. Their parameters can be derived from well-known AG theorems. Our notation and terminology on algebraic plane curves over finite fields, their function fields, divisors, and Riemann–Roch spaces are conventional (see, for example, [15]).

Let $X$ be an algebraic curve, that is, an affine or projective variety of dimension one, which is absolutely irreducible and non-singular and whose defining equations are (homogeneous) polynomials with coefficients in $\mathbb{F}_q$. Let $g = g(X)$ be the genus of $X$. $\mathbb{F}_q(X)$ denotes the function field of $X$. A divisor $D$ of $X$ is a formal sum $D = n_1 P_1 + \cdots + n_k P_k$, where $n_1, \ldots, n_k \in \mathbb{Z}$ and $P_1, \ldots, P_k$ are places of $\mathbb{F}_q(X)$. If $n_1, \ldots, n_k \ge 0$, then $D \succeq 0$. If $D$, $E$ are two divisors and $D - E \succeq 0$, then $D \succeq E$. In the case of a nonzero function $f$ of the function field $\mathbb{F}_q(X)$, and a place $P$, $v_P(f)$ stands for the order of $f$ at $P$. If $v_P(f) > 0$, then $P$ is a zero of $f$, while if $v_P(f) < 0$, then $P$ is a pole of $f$ with multiplicity $-v_P(f)$. The principal divisor of a nonzero function $f$ is $\mathrm{Div}(f) = \sum_P v_P(f) P$.

For a divisor $D$, the associated Riemann–Roch space $L(D)$ is the vector space

$$L(D) = \{ f \in \mathbb{F}_q(X) \setminus \{0\} \mid \mathrm{Div}(f) \succeq -D \} \cup \{0\}.$$

The dimension $\ell(D)$ of $L(D)$ is given by the Riemann–Roch Theorem [15, Theorem 1.1.15]:

$$\ell(D) = \ell(W - D) + \deg D - g + 1,$$

where $W$ is a canonical divisor. We denote the set of differentials on $X$ by $\Omega$. The differential space of the divisor $D$ is

$$\Omega(D) = \{ dh \in \Omega \mid \mathrm{Div}(dh) \succeq D \} \cup \{0\}.$$

In the following, $P_1, P_2, \ldots, P_n$ are pairwise distinct places on $X$, and $D$ is the divisor $D = P_1 + \cdots + P_n$. Let $G$ be another divisor with support disjoint from $D$. We define two types of AG codes, the functional and the differential codes, respectively:

$$\begin{aligned}
C_L(D, G) &= \{ (f(P_1), \ldots, f(P_n)) \mid f \in L(G) \}, \\
C_\Omega(D, G) &= \{ (\mathrm{res}_{P_1}(\omega), \ldots, \mathrm{res}_{P_n}(\omega)) \mid \omega \in \Omega(G - D) \}.
\end{aligned}$$

These codes are dual to each other, and $C_\Omega(D, G) = C_L(D, K + D - G)$ for a well-chosen canonical divisor $K$. The Riemann–Roch theorem enables us to estimate the dimension and the minimum distance of AG codes:

$$\dim(C_L(D, G)) \begin{cases} \ge \deg(G) - g + 1, & 0 \le \deg(G) \le 2g - 2, \\ = \deg(G) - g + 1, & 2g - 2 \le \deg(G) \le n, \\ \le \deg(G) - g + 1, & n \le \deg(G) \le n + 2g - 2. \end{cases}$$

The minimum distance of a functional code is at least its designed minimum distance

$$\delta_L = n - \deg(G).$$

Proof of Theorem 3

Let $k$ be the dimension and $h$ the Singleton defect of the AG code $C = C_L(D, G)$. Then $h + k = n + 1 - d \le n + 1 - \delta_L = \deg(G) + 1$. As the right-hand side of (1) is monotone decreasing in $h + k$, formula (4) follows.□
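The monotonicity used in this proof becomes visible after rewriting the right-hand side of (4) as $1 - (q^{mt} - 1) / ((q-1) q^n)$ with $t = \deg(G) + 1$. The sketch below evaluates this form for hypothetical parameters $q = 2$, $m = 2$, $n = 8$; the function name `ag_lower_bound` and these values are illustrative assumptions and do not correspond to a specific curve.

```python
from fractions import Fraction


def ag_lower_bound(q, m, n, deg_g):
    """Right-hand side of (4), written as 1 - (q^(m t) - 1) / ((q - 1) q^n)."""
    t = deg_g + 1
    return 1 - Fraction(q ** (m * t) - 1, (q - 1) * q**n)


# Theorem 3 requires deg(G) <= n/m - 1, here deg(G) in {0, 1, 2, 3}.
bounds = [ag_lower_bound(2, 2, 8, deg_g) for deg_g in range(0, 4)]
print(bounds)  # values 253/256, 241/256, 193/256, 1/256
```

The bound decreases as $\deg(G)$ grows and stays positive up to the largest admissible degree.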

7 Conclusion

We gave a lower bound for the probability that the dimension of the trace code of a linear code with a random multiplier vector attains the obvious upper bound $mk$. This is exactly the type of question that requires solid mathematical understanding in McEliece-type cryptographic protocols. Our formula uses only the size of the underlying field, the degree of the field extension, and the three main parameters of the code: length, dimension, and minimum distance. The result yields a concise lower bound for the probability that an AG code has maximum trace dimension. We also proved rigorously that full-support random alternant codes have dimension $n - mk$ with high probability. These results help to better understand the complexity of Monte Carlo algorithms in the key generation process of code-based cryptosystems, and thus give insight into the practicality and performance of such cryptosystems in real-world applications, in particular on resource-limited devices such as sensor nodes or smart cards.

Our approach works in the probabilistic model of random multiplier vectors. Random Goppa codes follow a different probabilistic model, so our results do not solve the dimension problem for random Goppa codes. This needs further research, but we are optimistic that our method can be extended.

Acknowledgments

This research was supported by the Ministry of Culture and Innovation and the National Research, Development, and Innovation Office within the Quantum Information National Laboratory of Hungary (Grant No. 2022-2.1.1-NL-2022-00004), and partially funded by NKFIH Grants K129335, K138596, K135885, and FK127906. This work has been accepted for presentation at CIFRIS23, the Congress of the Italian association of cryptography “De Componendis Cifris.”

Conflict of interest: Authors state no conflict of interest.

References

[1] Arute F, Arya K, Babbush R, Bacon D, Bardin JC, Barends R, et al. Quantum supremacy using a programmable superconducting processor. Nature. 2019;574(7779):505–10. https://doi.org/10.1038/s41586-019-1666-5.

[2] Shor PW. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J Comput. 1997;26(5):1484–509. https://doi.org/10.1137/S0097539795293172.

[3] McEliece RJ. A public-key cryptosystem based on algebraic coding theory. DSN Progress Report. 1978;42–44:114–6.

[4] National Institute of Standards and Technology. Post-Quantum Cryptography; Updated: March 25, 2020. http://csrc.nist.gov/projects/post-quantum-cryptography.

[5] Høholdt T, Van Lint JH, Pellikaan R. Algebraic geometry codes. Handbook of coding theory. 1998;1(Part 1):871–961.

[6] Couvreur A, Márquez-Corbella I, Pellikaan R. Cryptanalysis of public-key cryptosystems that use subcodes of algebraic geometry codes. In: Coding theory and applications. Cham: Springer; 2015. p. 133–40. https://doi.org/10.1007/978-3-319-17296-5_13.

[7] Couvreur A, Márquez-Corbella I, Pellikaan R. Cryptanalysis of McEliece cryptosystem based on algebraic geometry codes and their subcodes. IEEE Trans Inform Theory. 2017;63(8):5404–18. https://doi.org/10.1109/TIT.2017.2712636.

[8] Couvreur A, Otmani A, Tillich JP. Polynomial time attack on wild McEliece over quadratic extensions. IEEE Trans Inform Theory. 2016;63(1):404–27. https://doi.org/10.1109/TIT.2016.2574841.

[9] Wieschebrink C. Cryptanalysis of the Niederreiter public key scheme based on GRS subcodes. In: International Workshop on Post-Quantum Cryptography. Springer; 2010. p. 61–72. https://doi.org/10.1007/978-3-642-12929-2_5.

[10] Berger TP, Loidreau P. How to mask the structure of codes for a cryptographic use. Des Codes Cryptogr. 2005;35(1):63–79. https://doi.org/10.1007/s10623-003-6151-2.

[11] Couvreur A, Gaborit P, Gauthier-Umaña V, Otmani A, Tillich JP. Distinguisher-based attacks on public-key cryptosystems using Reed–Solomon codes. Des Codes Cryptogr. 2014;73(2):641–66. https://doi.org/10.1007/s10623-014-9967-z.

[12] Albrecht MR, Bernstein DJ, Chou T, Cid C, Gilcher J, Lange T, et al. Classic McEliece: conservative code-based cryptography; 2020. https://classic.mceliece.org/nist/mceliece-20201010.pdf.

[13] Mora R, Tillich JP. On the dimension and structure of the square of the dual of a Goppa code. Des Codes Cryptogr. 2023;91(4):1351–72. https://doi.org/10.1007/s10623-022-01153-w.

[14] Meneghetti A, Pellegrini M, Sala M. A formula on the weight distribution of linear codes with applications to AMDS codes. Finite Fields Appl. 2022;77:Paper No. 101933, 15. https://doi.org/10.1016/j.ffa.2021.101933.

[15] Stichtenoth H. Algebraic function fields and codes. Vol. 254 of Graduate Texts in Mathematics. 2nd edn. Berlin: Springer-Verlag; 2009. https://doi.org/10.1007/978-3-540-76878-4.

[16] Cooper C. On the distribution of rank of a random matrix over a finite field. In: Proceedings of the Ninth International Conference “Random Structures and Algorithms” (Poznan, 1999). Vol. 17. 2000. p. 197–212. https://doi.org/10.1002/1098-2418(200010/12)17:3/4<197::AID-RSA2>3.3.CO;2-B.

[17] Cooper C. On the rank of random matrices. Random Struct Algorithms. 2000;16(2):209–32. https://doi.org/10.1002/(SICI)1098-2418(200003)16:2<209::AID-RSA6>3.3.CO;2-T.

[18] Salmond D, Grant A, Grivell I, Chan T. On the rank of random matrices over finite fields; 2016.

[19] Studholme C, Blake IF. Properties of random matrices and applications; 2006. http://www.cs.toronto.edu/ cvs/coding/random_report.pdf.

[20] Studholme C, Blake IF. Random matrices and codes for the erasure channel. Algorithmica. 2010;56(4):605–20. https://doi.org/10.1007/s00453-008-9192-0.

[21] Wikipedia contributors. Q-Pochhammer symbol – Wikipedia, The Free Encyclopedia; 2022. [Online; accessed 27-January-2023]. https://en.wikipedia.org/w/index.php?title=Q-Pochhammer_symbol&oldid=1109461763.

[22] Delsarte P. On subfield subcodes of modified Reed–Solomon codes. IEEE Trans Inform Theory. 1975;21(5):575–6. https://doi.org/10.1109/TIT.1975.1055435.

Received: 2023-09-04

Revised: 2023-10-22

Accepted: 2023-10-31

Published Online: 2024-02-14

This work is licensed under the Creative Commons Attribution 4.0 International License.
