Quantum algorithms for computing general discrete logarithms and orders with tradeoffs

Martin Ekerå

doi:10.1515/jmc-2020-0006

Artikel Open Access

Quantum algorithms for computing general discrete logarithms and orders with tradeoffs

Martin Ekerå

Veröffentlicht/Copyright: 22. April 2021

Veröffentlicht von

Veröffentlichen auch Sie bei De Gruyter Brill

Manuskript einreichen Informationen für Autor*innen

Aus der Zeitschrift Journal of Mathematical Cryptology Band 15 Heft 1

Abstract

We generalize our earlier works on computing short discrete logarithms with tradeoffs, and bridge them with Seifert's work on computing orders with tradeoffs, and with Shor's groundbreaking works on computing orders and general discrete logarithms. In particular, we enable tradeoffs when computing general discrete logarithms. Compared to Shor's algorithm, this yields a reduction by up to a factor of two in the number of group operations evaluated quantumly in each run, at the expense of having to perform multiple runs. Unlike Shor's algorithm, our algorithm does not require the group order to be known. It simultaneously computes both the order and the logarithm. We analyze the probability distributions induced by our algorithm, and by Shor's and Seifert's order-finding algorithms, describe how these algorithms may be simulated when the solution is known, and estimate the number of runs required for a given minimum success probability when making different tradeoffs.

Keywords: Discrete logarithms; Order finding; Shor's algorithms

MSC 2010: 68Q12; 81P68; 94A60

1 Introduction

As in [5, 7, 8], let 𝔾 under ⊙ be a finite cyclic group of order r generated by g, and

x=[d] g=g⊙g⊙⋯⊙g︸d times.

The discrete logarithm problem is to compute d = log_g x given the group elements g and x. In cryptographic applications, the group 𝔾 is typically a subgroup of Fp* , for some prime p, or an elliptic curve group.

In the general discrete logarithm problem 0 ≤ d < r, whereas d is smaller than r by some order of magnitude in the short discrete logarithm problem.

1.1 Earlier works

In 1994, in a groundbreaking publication, Shor [25, 26] introduced polynomial time quantum algorithms for factoring integers and for computing general discrete logarithms in Fp* . The latter algorithm may be trivially adapted to compute general discrete logarithms in any finite cyclic group, provided that the group operation can be implemented efficiently quantumly.

Ekerå [5] initiated a line of research in 2016 by introducing a modified version of Shor's algorithm for computing discrete logarithms that more efficiently solves the short discrete logarithm problem. This work is of cryptographic relevance as the short discrete logarithm problem underpins the security of many implementations of cryptosystems instantiated with safe-prime groups. A notable example is Diffie-Hellman key exchange [3] in TLS, IKE and NIST SP 800-56A [2, 9, 13].

In a follow-up work, Ekerå and Håstad [8] enabled tradeoffs in Ekerå's algorithm using ideas that directly parallel those of Seifert [24] in his work on enabling tradeoffs in Shor's order-finding algorithm; the quantum part of Shor's factoring algorithm. Ekerå and Håstad furthermore showed how the RSA integer factoring problem, that underpins the widely deployed RSA cryptosystem [21], may be reduced via [11] to a short discrete logarithm problem and attacked quantumly. This gives rise to a quantum algorithm that more efficiently solves the RSA integer factoring problem when making tradeoffs and comparing to Shor's original factoring algorithm, or to Seifert's factoring algorithm.

Ekerå [7] subsequently refined the classical post-processing in [8] to render it more efficient. With this improved post-processing, the algorithm of Ekerå and Håstad is shown in [7] to outperform Shor's and Seifert's factoring algorithms when targeting RSA integers, irrespective of whether tradeoffs are made.

A key component to this result was the development of a classical simulator for the quantum algorithm for computing short discrete logarithms: For problem instances for which the solution is classically known, this simulator allows outputs to be generated that are representative of outputs that would be generated by the quantum algorithm if executed on a quantum computer. This in turn allows the efficiency of the classical post-processing to be experimentally assessed.

1.2 Our contributions

We generalize and bridge our earlier works on computing short discrete logarithms with tradeoffs, Seifert's work on computing orders with tradeoffs and Shor's groundbreaking works on computing orders and general discrete logarithms. In particular, we enable tradeoffs when computing general discrete logarithms.

Compared to Shor's algorithm for computing general discrete logarithms, this yields a reduction by up to a factor of two in the number of group operations evaluated quantumly in each run, at the expense of having to perform multiple runs. Unlike Shor's algorithm, our algorithm does not require the group order to be known. It simultaneously computes both the order and the logarithm. This allows our algorithm to outperform Shor's original algorithms with respect to the overall number of group operations that need to be evaluated quantumly in some cases even when not making tradeoffs. One cryptographically relevant example of such a case is the computation of discrete logarithms in Schnorr groups of unknown order.

We analyze the probability distributions induced by our algorithm, and by Shor's and Seifert's order-finding algorithms, describe how all of these algorithms may be simulated when the solution to the problem instance is known, and estimate the number of runs required for a given minimum success probability when making different tradeoffs.

1.2.1 On the cryptographic significance of this work

The security of virtually all currently widely deployed asymmetric cryptosystems is based on the intractability of either the discrete logarithm problem or the integer factoring problem.

In this work, we further the understanding of how hard these two key problems are to solve quantumly when not on special form. We hope that our results may prove useful when developing cost estimates for quantum attacks, and that they may inform decisions on when to mandate migration from the currently deployed asymmetric cryptosystems to post-quantum secure cryptosystems.

1.2.2 Further details and overview

Our algorithm for computing general discrete logarithms in turn consists of two algorithms;

a quantum algorithm, that upon input of a generator g of order r, and an element x = [d] g where 0 ≤ d < r, outputs a pair (j, k), and
a classical probabilistic post-processing algorithm, that upon input of a set of n pairs (j, k), produced by n runs of the quantum algorithm, computes d.

In addition to the above post-processing algorithm, we furthermore specify

a classical probabilistic post-processing algorithm, that upon input of a set of n integers j computes the order r. Note that the same set of integers j may be used as input to both this and the above post-processing algorithm, by breaking out j from the pairs (j, k).

The quantum algorithm is identical to the algorithm in [7, 8] for computing short discrete logarithms with tradeoffs. The key difference in this work is that we admit general discrete logarithms and comprehensively analyze the probability distribution that the algorithm induces for such logarithms.

The post-processing algorithm for d is a tweaked version of the lattice-based algorithm in [7], whereas the algorithm for r is a natural generalization of the lattice-based algorithm in [7] first sketched in a pre-print of [8]. It is similar to the post-processing in [24].

The quantum algorithm is parameterized under a tradeoff factor s. This factor controls the tradeoff between the requirements that the algorithm imposes on the quantum computer, and the number of runs, n, required to attain a given minimum probability q of recovering d and r in the classical post-processing.

Following [7], we estimate n for a given problem instance, represented by d and r, and fixed s and q, by simulating the quantum algorithm. We first use simulated output to heuristically estimate n, and then verify the estimate by executing the two post-processing algorithms with respect to simulated output.

The simulator is based on a high-resolution two-dimensional histogram of the probability distribution induced by the quantum algorithm. By sampling the histogram, we generate pairs (j, k) that very closely approximate output that would be produced by the quantum algorithm if executed on a quantum computer.

To construct the histogram, we first derive a closed-form expression that approximates the probability of the quantum algorithm yielding (j, k) as output, and an upper bound on the error in the approximation. We then integrate this expression and the error bound numerically in different regions of the plane.

Our simulations show that when not making tradeoffs, a single run suffices to compute d or r with ≥ 99% success probability. When making tradeoffs, slightly more than s runs are typically required to achieve a similar success probability. In Appendix A we show that these results extend to order finding and factoring.

Note that the simulator requires d and r to be explicitly known: It cannot be used for problem instances represented by group elements g and x = [d] g.

1.2.3 Structure of this paper

The quantum algorithm is described in Section 2. In Section 3, we analyze the probability distribution it induces, and derive a closed-form expression that approximates the probability of it yielding (j, k) as output. In Sections 4 and 5, we describe how the high-resolution histogram is constructed by integrating the closed-form expression, and how the histogram is sampled to simulate the quantum algorithm.

In Section 6, we describe the two post-processing algorithms for recovering d and r from a set of n pairs (j, k). In Section 7, we use the simulator to estimate the number of runs n required to solve a given problem instance for d and r, with minimum success probability q, as a function of the tradeoff factor s.

We summarize past and new results, and discuss related applications, such as order finding and integer factoring, in Sections 8 and 9, and in the appendices.

1.3 Notation

The below notation is used throughout this paper:

u mod n denotes u reduced modulo n constrained to [0, n).
{u}_n denotes u reduced modulo n constrained to [−n/2, n/2).
⌈u⌉, ⌊u⌋ and ⌊u⌉ denotes u rounded up, down and to the closest integer.
|a+ib|=a2+b2 where a, b ∈ ℝ denotes the Euclidean norm of a + ib.
|u| denotes the Euclidean norm of the vector u = (u₀, . . ., u_n−1) ∈ ℝⁿ.
u ~ v is used to denote that u and v are approximately of similar size.

1.4 Randomization

Given two group elements g and x′ = [d′] g to be solved for d′, the general discrete logarithm problem may be randomized as follows:

Select a random integer t. Let x = x′ ⊙ [t] g = [d] g.
Solve g and x for d ≡ d′ + t (mod r) and optionally for r.
Compute and return d′ ≡ d − t (mod r).

Hence, we may assume without loss of generality that d is selected uniformly at random on [0, r) in the analysis of the quantum algorithm.

If r is known, t should be selected uniformly at random on [0, r), otherwise on [0, 2^m+c) for m the bit length of r and c a sufficiently large integer constant for the selection of x to be indistinguishable from a uniform selection from 𝔾. Solving for r in step 2 is only necessary if r is unknown and d′ must be on [0, r) when returned.

2 The quantum algorithm

In this section we describe the quantum algorithm, that upon input of a generator g and an element x = [d] g, where 0 ≤ d < r, outputs a pair (j, k) and element y.

As stated earlier, the algorithm is parameterized under a small integer constant s ≥ 1, referred to as the tradeoff factor, that controls the tradeoff between the number of runs required and the requirements imposed on the quantum computer.

Let m be the integer such that 2^m−1 ≤ r < 2^m, let ℓ = ⌈m/s⌉, and let
Ψ=12m+2ℓ∑a=02m+ℓ−1 ∑b=02ℓ−1| a,b,0 〉.
Compute [a] g ⊙ [−b] x = [a − bd] g to the third register to obtain
Ψ=12m+2ℓ∑a=02m+ℓ−1 ∑b=02ℓ−1| a,b,[a−bd] g 〉.
Compute QFTs of size 2^m+ℓ and 2^ℓ of the first two registers to obtain
Ψ=12m+2ℓ∑a=02m+ℓ−1 ∑b=02ℓ−1 ∑j=02m+ℓ−1 ∑k=02ℓ−1e2π i (aj+2mbk)/2m+ℓ| j,k,[a−bd] g 〉.
Observe the system to obtain (j, k) and y = [e] g where e = (a − bd) mod r.
Note that y is observed only to highlight that the system is forced to collapse to combinations of a and b such that e = (a − bd) mod r for fixed e.

The above steps may be interleaved, rather than executed sequentially, so as to allow the qubits in the first two registers to be recycled [10, 18, 19]. A single control qubit then suffices to implement the first two control registers. This is possible as the qubits in the control registers are not initially entangled; the registers are initialized to uniform superpositions of 2^m+ℓ and 2^ℓ values, respectively.

In Shor's algorithm for computing general discrete logarithms, the two control registers are instead of length m qubits. Both registers are initialized to uniform superpositions of r values. This makes the single control qubit optimization less straightforward to apply, and the initial superpositions harder to induce. Apart from this difference, the implementation complexity of Shor's algorithm and our algorithm may be compared in a fair manner in terms of the total exponent lengths.

In practice, the exponentiation of group elements would typically be performed by computing a group operation controlled by each bit in the exponent. Hence, a total of 2m group operations are performed in Shor's algorithm, compared to m + 2m/s in our algorithm. As s increases, this tends to m operations, providing an advantage over Shor's original algorithm by up to a factor of two at the expense of having to execute the algorithm multiple times. This reduction in the number of group operations translates into a corresponding reduction in the coherence time and circuit depth requirements of our quantum algorithm.

Note that our algorithm does not require r to be known. It suffices that the size of r is known. For comparison, Shor's algorithm requires r to be known. This explains why Shor needs to perform only 2m operations, whilst we need 3m operations when not making tradeoffs. As we shall see, we do in fact compute both d and r simultaneously, whilst Shor computes d given r.

3 The probability of observing (j, k) and y

In step 4 of the algorithm in Section 2, we obtain (j, k) and y = [e] g with probability

(1) 122(m+2ℓ)|∑a ∑bexp[2πi2m+ℓ (aj+2mbk)]|2

where the sum is over all pairs (a, b), such that 0 ≤ a < 2^m+ℓ and 0 ≤ b < 2^ℓ, respecting the condition e ≡ a − bd (mod r). In this section, we seek a closed-form error-bounded approximation to (1) summed over all y = [e] g ∈ 𝔾. To this end, we first perform a variable substitution to obtain contiguous summation intervals.

As a = e + bd + n_r r for n_r an integer such that 0 ≤ a = e + bd + n_r r < 2^m+ℓ, it follows that

(2) ⌈−(e+bd)/r⌉≤nr<⌈(2m+ℓ−(e+bd))/r⌉.

Substituting a for e + bd + n_r r in (1) and adjusting the phase therefore yields

(3) 122(m+2ℓ)|∑b=02ℓ−1 ∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1exp[2πi2m+ℓ(nrrj+b(dj+2mk))]|2.

By introducing arguments α_d and α_r, and corresponding angles θ_d and θ_r, where

αd={dj+2mk}2m+ℓ αr={rj}2m+ℓ θd=θ(αd)=2παd2m+ℓ θr=θ(αr)=2παr2m+ℓ

we may write (3) as a function of α_d and α_r, and e, as

(4) 122(m+2ℓ)|∑b=02ℓ−1∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1exp[2πi2m+ℓ(nrαr+bαd)]|2

or of θ_d and θ_r, and e, as

(5) ρ(θd,θr,e)=122(m+2ℓ)|∑b=02ℓ−1eiθdb∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1eiθrnr|2.

This implies that the probability of observing the pair (j, k) and y = [e] g depends only on (α_d, α_r) and e, or equivalently on (θ_d, θ_r) and e. The probability is virtually independent of e in practice, as e can at most shift the endpoints of the summation interval in the inner sums in (4) and (5) by one step.

As was stated above, we seek a closed-form approximation to ρ(θ_d, θ_r, e) summed over all r group elements y = [e] g ∈ 𝔾. Hereinafter, we denote this probability

P(θd,θr)=∑e=0r−1ρ(θd,θr,e)=122(m+2ℓ)∑e=0r−1 |∑b=02ℓ−1eiθdb∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1eiθrnr|2,

and we furthermore use angles and arguments interchangeably, depending on which representation best lends itself to analysis in each step of the process.

3.1 Preliminaries

To gain some intuition, we write ρ(θ_d, θ_r, e) as

122(m+2ℓ)|∑b=02ℓ−1ei(θdb+θr⌈−(e+bd)/r⌉) ∑nr=0⌈(2m+ℓ−(e+bd))/r⌉−⌈−(e+bd)/r⌉−1eiθrnr|2

and note that there are two obstacles to placing this expression on closed form:

Firstly, the summation interval in the inner sum over n_r depends on the summation variable b of the outer sum. Secondly, the exponent of the summand in the outer sum over b contains a rounding operation that depends on b.

By using that ⌈(2^m+ℓ − (e + bd))/r⌉ − ⌈−(e + bd)/r⌉ ≈ ⌈2^m+ℓ/r⌉ we may remove the dependency between the inner and outer sums, and by using that ⌈−(e + bd)/r⌉ ≈ −(e + bd)/r we may remove the rounding operation.

By making these two approximations, and by adjusting the phase, we may derive an approximation to ρ(θ_d, θ_r, e) that is independent of e, enabling us to sum ρ(θ_d, θ_r, e) over the r values of e, corresponding to the r group elements y = [e] g ∈ 𝔾, simply by multiplying by r. This yields

(6) P(θd,θr)≈r22(m+2ℓ)|∑b=02ℓ−1ei(θd−θrd/r)b|2|∑nr=0⌈2m+ℓ/r⌉−1eiθrnr|2=r22(m+2ℓ)|ei2ℓ(θd−θrd/r)−1ei(θd−θrd/r)−1|2|ei⌈2m+ℓ/r⌉θr−1eiθr−1|2

where we furthermore need to assume in (6) that θ_d − θ_rd/r ≠ 0 and θ_r ≠ 0.

This closed-form approximation captures the general characteristics of the probability distribution induced by the quantum algorithm. However, it is seemingly non-trivial to derive a good bound for the error in this approximation.

In what follows, we use techniques similar to those employed above to derive an error-bounded closed-form approximation to ρ(θ_d, θ_r, e) such that the error is negligible in the regions of the plane where the probability mass is concentrated. As was the case above, we will find that the error-bounded approximation of ρ(θ_d, θ_r, e) is independent of e, enabling us to approximate P(θ_d, θ_r) simply by multiplying the closed-form approximation to ρ(θ_d, θ_r, e) by r.

3.1.1 Constructive interference

Before we proceed to develop the closed-form approximation, we note that for a fixed problem instance and fixed e, the sums in ρ(θ_d, θ_r, e) are over a constant number of unit vectors in the complex plane. For such sums, constructive interference arises when all vectors point in approximately the same direction.

In regions of the plane where θ_r and θ_d − θ_rd/r are both small, we hence expect constructive interference to arise. The probability mass is expected to concentrate in regions where constructive interference arises, and where the concentration of pairs (θ_d, θ_r) yielded by the integers pairs (j, k) is great.

In what follows, we therefore seek to derive a closed-form approximation to ρ(θ_d, θ_r, e), and an associated bound on the error in the approximation, such that the error is small when θ_r and θ_d − θ_rd/r are small.

3.2 Closed-form approximation with error bounds

To derive a closed-form approximation to ρ(θ_d, θ_r, e), we first observe that the sums in the expression for ρ(θ_d, θ_r, e) may be regarded as sums over the points in a region ℛ in a lattice L^a,b, as is illustrated in Figure 1. Note that this figure also contains other elements to which we shall return as the analysis progresses.

Figure 1

The lattice L^a,b for σ = 2, m = ℓ = 5, e = 0, r = 31 and d = 27. All red filled points are in ℛ. The region 𝒜 and its translated replicas are drawn as dashed rectangles. All blue outlined points are in 𝒜 or in one of its replicas. The gray triangles outline the points that are in 𝒜 or one of its replicas, but not in ℛ, and vice versa.

Definition 3.1

Let L^a,b be the lattice spanned by (d, 1) and (r, 0) so that the set of points in L^a,b is given by (a − e, b) = b(d, 1) + n_r(r, 0) for integers b and n_r.

Definition 3.2

Let ℛ be the region in L^a,b where 0 ≤ a < 2^m+ℓ and 0 ≤ b < 2^ℓ.

Definition 3.3

Let

Sℛ=|sℛ|222(m+2ℓ) where sℛ=∑(a,b)∈ℛexp[2πi2m+ℓ(aj+2mbk)].

Claim 3.4

The probability ρ(θ_d, θ_r, e) = S_ℛ.

Proof

The points in ℛ are given by (a − e, b) = b(d, 1) + n_r(r, 0), for 0 ≤ b < 2^ℓ and n_r on (2) so that 0 ≤ a = e + bd + n_rr < 2^m+ℓ, which implies that

Sℛ=122(m+2ℓ)|∑b=02ℓ−1 ∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1exp[2πi2m+ℓ(nrrj+b(dj+2mk))]|2=122(m+2ℓ)|∑b=02ℓ−1eiθdb∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1eiθrnr|2=ρ(θd,θr,e)

by the preliminary analysis in Section 3 and so the claim follows.

In what follows, we derive a closed-form approximation to ρ(θ_d, θ_r, e) = S_ℛ, and an associated error bound, in three steps.

3.2.1 Preliminaries

Before proceeding as outlined above, we first introduce some preliminary claims.

Claim 3.5

For u, v ∈ ℂ and Δ = u − v it holds that

||u|2−|v|2|≤2 |u| |Δ|+|Δ|2.

Proof

First verify that

|u|2−|v|2=|u|2−|u−Δ|2=uu¯−(u−Δ)(u−Δ)¯=uu¯−(u−Δ)(u¯−Δ¯)=uΔ¯+u¯Δ−|Δ|2

where the overlines denote complex conjugates. This implies that

||u|2−|v|2|≤|u| |Δ¯|+|u¯| |Δ|+|Δ|2=2 |u| |Δ|+|Δ|2

and so the claim follows.

Claim 3.6

|e^iϕ − 1| ≤ |ϕ| for any ϕ ∈ ℝ.

Proof. It suffices to show that |e^iϕ − 1|² = 2(1 − cos ϕ) ≤ ϕ² from which the claim follows as cos ϕ ≥ 1 − ϕ²/2 for any ϕ ∈ ℝ.

3.2.2 Bounding |s_ℛ|

Before proceeding to the first approximation step, we furthermore bound |s_ℛ| in this section, as this bound is needed in the following analysis.

Lemma 3.7

The sum s_ℛ is bounded by |s_ℛ| ≤ 2^2ℓ+1.

Proof

By Claim 3.4 the sum

sℛ=∑b=02ℓ−1eiθdb∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1eiθrnr

where the outer sum over b is over 2^ℓ values and the inner sum over n_r is over at most 2^ℓ+1 values by Claim 3.8 below. As s_ℛ is a sum of at most 2^2ℓ+1 complex unit vectors, it follows that |s_ℛ| ≤ 2^2ℓ+1, and so the lemma follows.

Claim 3.8

For Δ = ⌈(2^m+ℓ − (e + bd))/r⌉ − ⌈−(e + bd)/r⌉, it holds that

Δ=⌈2m+ℓ/r⌉−t≤2ℓ+1 for some t∈{0,1}.

Proof

For some f₁, f₂ ∈ [0, 1), it holds that

Δ=⌈⌈2m+ℓ/r⌉−f1+⌈−(e+bd)/r⌉−f2⌉−⌈−(e+bd)/r⌉=⌈2m+ℓ/r⌉+⌈−(e+bd)/r⌉−⌈−(e+bd)/r⌉︸=0+⌈−f1−f2⌉=⌈2m+ℓ/r⌉−⌊f1+f2⌋=⌈2m+ℓ/r⌉−t

where t = ⌊f₁ + f₂⌋ ∈ {0, 1} as f₁ + f₂ ∈ [0, 2). Furthermore, recall that r ≥ 2^m−1. Hence, it follows that 2^m+ℓ/r ≤ 2^ℓ+1, so Δ = ⌈2^m+ℓ/r⌉ − t ≤ 2^ℓ+1, and so the claim follows.

3.2.3 Approximating S_ℛ by S_𝒜T_𝒜

In the first approximation step, we approximate S_ℛ by first summing the points in a small region 𝒜 in ℛ, and by then replicating and translating the points in 𝒜, and the associated sum over these points, so as to approximately cover ℛ, see Figure 1.

Definition 3.9

Let 𝒜 be the region in L^a,b where 0 ≤ a < 2^m+ℓ and 0 ≤ b < 2^σ for σ an integer parameter selected on 0 < σ < ℓ.

Definition 3.10

Let

SA=|sA|222(m+2ℓ) where sA=∑(a,b)∈Aexp[2πi2m+ℓ(aj+2mbk)].

Claim 3.11

SA=122(m+2ℓ)|∑b=02σ−1eiθdb∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1eiθrnr|2.

Proof

The points in 𝒜 are given by (a − e, b) = b(d, 1) + n_r(r, 0) for 0 ≤ b < 2^σ and n_r on (2) so that 0 ≤ a = e + bd + n_rr < 2^m+ℓ, which implies that

SA=122(m+2ℓ)|∑b=02σ−1 ∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1exp[2πi2m+ℓ(nrrj+b(dj+2mk))]|2=122(m+2ℓ)|∑b=02σ−1eiθdb∑nr=⌈−(e+bd)/r⌉⌈(2m+ℓ−(e+bd))/r⌉−1eiθrnr|2

in analogy with the analysis in Section 3, but with b on 0 ≤ b < 2^σ as opposed to 0 ≤ b < 2^ℓ, and so the claim follows.

To replicate and translate the points in 𝒜 so as to approximately cover ℛ, we furthermore introduce t_𝒜 and T_𝒜, as defined below:

Definition 3.12

Let

TA=|tA|2 where tA=∑t=02ℓ−σ−1ei(θd2σ+θr⌈−2σd/r⌉) t.

The error when approximating S_ℛ by S_𝒜T_𝒜 may now be bounded as follows:

Lemma 3.13

The error when approximating s_ℛ by s_𝒜t_𝒜 is bounded by

|sℛ−sAtA|≤22ℓ−σ+1.

Proof

The exponential sum t_𝒜 replicates and translates the partial sum over 𝒜 so as to approximately cover ℛ as is illustrated in Figure 1. Every time the region is replicated, it is translated by a vector in L^a,b that corresponds to e^{i(θ_d 2^σ + θ_r⌈−2^σd/r⌉)}.

The error that arises when s_ℛ is approximated by s_𝒜t_𝒜 is hence due to points that are in ℛ but excluded from the sum, and conversely to points not in ℛ that are erroneously included in the sum. Hereinafter these points will be referred to as the erroneous points.

The erroneous points fall within the two gray triangles in Figure 1. Both triangles are of horizontal side length 2^ℓ and vertical side length 2^ℓ−σ (2^σd mod r), as the region 𝒜 is replicated and translated 2^ℓ−σ times in total, and as it is shifted horizontally by 2^σ and vertically by 2^σd mod r every time it is translated.

To upper-bound the number of lattice points in each triangle, note that the lattice points are on 2^ℓ vertical lines, evenly separated horizontally by a distance of one. The points on each vertical line are evenly separated vertically by a distance of r, with varying starting positions on each line. For h(b) = 2^ℓ−σ (2^σd mod r)(b/2^ℓ) the height of each triangle at b, we have that at most

N(b)=1+⌊h(b)r⌋≤1+h(b)r=1+2σd modrrb2σ≤1+b2σ

lattice points are then on the vertical line that cuts through the triangle at b, as may be seen by maximizing over all possible starting points. By summing N(b) over all 2^ℓ lines, we thus obtain an upper bound of

∑b=02ℓ−1N(b)≤2ℓ+12σ∑b =02ℓ−1b=2ℓ+12σ2ℓ(2ℓ−1)2≤22ℓ−σ

on the number of points in each triangle, where we have used that 2^{2ℓ−σ−1} ≥ 2^ℓ as σ is an integer on 0 < σ < ℓ.

As there are two triangles, the total number of erroneous points is upper-bounded by 2 · 2^2ℓ−σ = 2^2ℓ−σ+1. Each erroneous point corresponds to a unit vector in the complex sum s_ℛ − s_𝒜t_𝒜, which implies |s_ℛ − s_𝒜t_𝒜| ≤ 2^2ℓ−σ+1, and so the lemma follows.

Lemma 3.14

The error when approximating S_ℛ by S_𝒜T_𝒜 is bounded by

|Sℛ−SATA|≤2−2m−σ+4.

Proof

By Claim 3.5, it holds that

| |sℛ|2−|sAtA|2 |≤2 |sℛ| |sℛ−sAtA|+|sℛ−sAtA|2≤2⋅22ℓ+1⋅22ℓ−σ+1+24ℓ−2σ+2≤3⋅24ℓ−σ+2≤24(ℓ+1)−σ

as |s_ℛ − s_𝒜t_𝒜| ≤ 2^2ℓ−σ+1 by Lemma 3.13 and |s_ℛ| ≤ 2^2ℓ+1 by Lemma 3.7.

From the above, and Definitions 3.3, 3.10 and 3.12, we have that

|Sℛ−SATA|=| |sℛ|2−|sAtA|2 |22(m+2ℓ)≤24(ℓ+1)−σ22(m+2ℓ)=2−2m−σ+4

and so the lemma follows.

As t_𝒜 is a geometric sum T_𝒜 = |t_𝒜|² may be placed on closed form. It remains to derive a closed-form approximation to S_𝒜. In what follows, we do this in two additional approximation steps.

3.2.4 Approximating S_𝒜 by SA′

In the second approximation step, we derive a closed-form approximation to S_𝒜, by first approximating S_𝒜 by the product SA′ of two sums, such that the leading sum may be placed on closed form, and such that the trailing sum may be placed on closed form by means of a third approximation step.

Definition 3.15

Let

SA′=|sA′|222(m+2ℓ) where sA′=∑b=02σ−1ei(θdb+θr⌈−(e+bd)/r⌉)∑nr=0⌈2m+ℓ/r⌉−1eiθrnr.

Lemma 3.16

The error when approximating s_𝒜 by sA′ is bounded by

|sA−sA′|≤2σ.

Proof

As s_𝒜 and sA′ are sums of complex unit vectors, and as the sums differ by at most 2^σ vectors, as may be seen by comparing the summation intervals using Claim 3.8, it follows that |sA−sA′|≤2σ , and so the lemma follows.

Lemma 3.17

The sum sA′ is bounded by |sA′|≤2ℓ+σ+1 .

Proof

In the expression for sA′ in Definition 3.15, the sum over b assumes 2^σ values and the sum over n_r assumes at most 2^ℓ+1 values as the order r ≥ 2^m−1. As sA′ is a sum of at most 2^ℓ+σ+1 complex unit vectors, it follows that |sA′|≤2ℓ+σ+1 , and so the lemma follows.

Lemma 3.18

The error when approximating S_𝒜 by SA′ is bounded by

|SA−SA′|≤2−2m−3ℓ+2σ+3.

Proof

By Claim 3.5, it holds that

||sA|2−|sA′|2|≤2|sA′||sA−sA′|+|sA−sA′|2≤2⋅2ℓ+σ+1⋅2σ+22σ≤3⋅2ℓ+2σ+1≤2ℓ+2σ+3

as |sA−sA′|≤2σ by Lemma 3.16 and |sA′|≤2ℓ+σ+1 by Lemma 3.17.

From the above, and Definitions 3.10 and 3.15, we have that

|SA−SA′|=| |sA|2−|sA′|2|22(m+2ℓ)≤2ℓ+2σ+322(m+2ℓ)=2−2m−3ℓ+2σ+3

and so the lemma follows.

The trailing sum in sA′ is geometric. Hence, it may be trivially placed on closed form. Due to the rounding operation in the exponent, this approach is not valid for the leading sum; we need a third approximation step.

3.2.5 Approximating SA′ by S SA″

For θ_d and θ_r such that the angles θ_db + θ_r − ⌈(e + bd)/r⌉ ≈ (θ_d − θ_rd/r) b in the leading sum in sA′ are small for all b on 0 ≤ b < 2^σ, all 2^σ terms in the sum are approximately one. In the third and final step of the approximation, we bound the error when simply approximating all terms in the leading sum by one.

Definition 3.19

Let

SA″=|sA″|222(m+2ℓ) where sA″=2σ∑nr=0⌈2m+ℓ/r⌉−1eiθrnr.

Lemma 3.20

The difference between sA′ and sA″ is bounded by

|sA′−sA″|≤2σ−1(|θd|+|θr|) |sA″|.

Proof

First observe that

|sA′−sA″|=|∑b=02σ−1(ei(θdb+θr⌈−(e+bd)/r⌉)−1)|︸|Δ||∑nr=0⌈2m+ℓ/r⌉−1eiθrnr|.

By using Claim 3.6 and the triangle inequality, it follows that

|Δ|=|∑b=02σ−1(ei(θdb+θr⌈−(e+bd)/r⌉)−1)|≤∑b=02σ−1|ei(θdb+θr⌈−(e+bd)/r⌉)−1|≤∑b=02σ−1|θdb+θr⌈−(e+bd)/r⌉|=∑b=02σ−1|θdb−θr⌊(e+bd)/r⌋|≤(|θd|+|θr|)∑b=02σ−1b≤(|θd|+|θr|) 2σ(2σ−1)2≤22σ−1(|θd|+|θr|)

where we use that ⌈−x⌉ = − ⌊x⌋ and ⌊(e + bd)/r⌋ ≤ b. To verify the latter claim, note that f₁ = e/r ∈ [0, 1) and f₂ = bd/r ∈ [0, b) as e, d ∈ [0, r). This implies that ⌊(e + bd)/r⌋ = ⌊f₁ + f₂⌋ ∈ [0, b] as f₁ + f₂ ∈ [0, b + 1).

By combining the above results, we now have that

|sA′−sA″|≤22σ−1(|θd|+|θr|)|∑nr=0⌈2m+ℓ/r⌉−1eiθrnr|=2σ−1(|θd|+|θr|)|sA″|

and so the lemma follows.

Lemma 3.21

The error when approximating SA′ by SA″ is bounded by

|SA′−SA″|≤2σ−1(|θd|+|θr|)(2+2σ−1(|θd|+|θr|))SA″.

Proof

By Claim 3.5, it holds that

| |sA′|2−|sA″|2 |≤2|sA″||sA′−sA″|+|sA′−sA″|2≤2⋅2σ−1(|θd|+|θr|)|sA″|2+22(σ−1)(|θd|+|θr|)2|sA″|2=2σ−1(|θd|+|θr|)(2+2σ−1(|θd|+|θr|))|sA″|2

as |sA′−sA″|≤2σ−1(|θd|+|θr|) |sA″| by Lemma 3.20.

From the above, and Definitions 3.15 and 3.19, we have that

|SA′−SA″|=||sA′|2−|sA″|2|22(m+2ℓ)≤2σ−1(|θd|+|θr|)(2+2σ−1(|θd|+|θr|))SA″

and so the lemma follows.

This yields an approximation SA″ to SA′ that may be placed on closed form.

3.2.6 Main approximability result

By combining the previous results, the main approximability result follows:

Theorem 3.22

The probability P(θ_d, θ_r) of observing a specific pair (j, k) with angle pair (θ_d, θ_r), summed over all y ∈ 𝔾, may be approximated by

assuming θ_d2^σ + θ_r ⌈−2^σd/r⌉ ≠ 0 and θ_r ≠ 0 when placing the expression on closed form. The approximation error |P(θd,θr)−P˜(θd,θr)|≤e˜(θd,θr) where

e˜(θd,θr)=242m+σ+232m+ℓ+2σ2(|θd|+|θr|)(2+2σ2(|θd|+|θr|))P˜(θd,θr).

Proof

The probability ρ(θ_d, θ_r, e) of observing a specific pair (j, k), with angle pair (θ_d, θ_r), and some group element y = [e] g ∈ 𝔾, is S_ℛ by Claim 3.4.

The error when approximating S_ℛ by S_𝒜T_𝒜 is bounded by

|Sℛ−SATA|≤2−2m−σ+4

by Lemma 3.14. The error when approximating S_𝒜T_𝒜 by SA′TA is bounded by

|SATA−SA′TA|≤2−2m−3ℓ+2σ+3TA

by Lemma 3.18. The error when approximating S′ATA by SA″TA is bounded by

|SA′TA−SA″TA|≤2σ−1(|θd|+|θr|)(2+2σ−1(|θd|+|θr|)) SA″TA

by Lemma 3.21. By the triangle inequality

|Sℛ−SA″TA|=|(Sℛ−SATA)+(SATA−SA′TA)+(SA′TA−SA″TA)|≤|Sℛ−SATA|+TA |SA−SA′|+TA |SA′−SA″|.

Neither of these three error terms, nor the expression for SA″TA , depend on e. Hence, we may sum over all r elements y = [e] g ∈ 𝔾 by multiplying by r. It therefore follows that P˜(θd,θr)=rSA″TA is an approximation to P(θ_d, θ_r), and that the error that arises in this approximation is bounded by

r | Sℛ−SATA |+rTA | SA−SA′ |+rTA | SA′−SA″ | ≤2−2m−σ+4 r+2−2m−3ℓ+2σ+3 rTA + 2σ−1(| θd |+| θr |) (2+2σ−1(| θd |+| θr |)) rSA″TA ≤242m+σ+232m+ℓ+2σ2(| θd |+| θr |)(2+2σ2(| θd |+| θr |))P˜(θd,θr)

where we use that r < 2^m, and that T_𝒜 ≤ 2^2(ℓ−σ) as it is the square norm of a sum of 2^ℓ−σ unit vectors by Definition 3.12, and so the theorem follows.

In Appendix C we demonstrate the soundness of this approximation.

4 The distribution of pairs (α_d, α_r)

In this section, we identify and count all pairs (j, k) that yield (α_d, α_r) and analyze the distribution and density of pairs (α_d, α_r) in the plane.

Definition 4.1

An argument pair (α_d, α_r) is said to be admissible if there exists an integer pair (j, k), for j on 0 ≤ j < 2^m+ℓ and k on 0 ≤ k < 2^ℓ, such that

αd={dj+2mk}2m+ℓ and αr={rj}2m+ℓ.

Definition 4.2

Let κ_d denote the greatest integer such that 2^κ_d divides d, and let κ_r denote the greatest integer such that 2^κ_r divides r.

Definition 4.3

Let L^α be the lattice generated by the rows in

[δr2κr2m−γ0] where δr=d(r2κr)−1 mod 2m−γ

and γ = max(0, κ_r − (ℓ + κ_d)).

Lemma 4.4

The admissible argument pairs (α_d, α_r) are vectors in L^α in the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1. There are 2^{m+2ℓ−κ_r+γ} distinct admissible argument pairs, each occurring with multiplicity 2^κ_r−γ.

Proof

As α_r ≡ rj (mod 2^m+ℓ), the set of integers j that yield α_r are given by

j≡αr2κr(r2κr)−1+2m+ℓ−κrtr (mod 2m+ℓ)

as t_r runs through all integers on 0 ≤ t_r < 2^κ_r. As α_d ≡ dj + 2^mk (mod 2^m+ℓ), we need

(7) αd≡d (αr2κr(r2κr)−1+2m+ℓ−κrtr)+2mk≡αr2κrd(r2κr)−1+2m+ℓ−κr+κd dtr2κd︸A+ 2mk ︸B (mod 2m+ℓ)

for k an integer on 0 ≤ k < 2^ℓ, to ensure compatibility. As 2^m−γ is the largest power of two to divide both 2^m and 2^{m+ℓ−κ_r+κ_d}, by the definition of γ, the congruence relation α_d ≡ (α_r/2^κ_r)d (r/2^κ_r)⁻¹ (mod 2^m−γ) must hold.

As t_r and k run through all pairwise combinations, the set of 2^ℓ+κ_r arguments α_d generated by (7) is equal to that generated by

(8) αd≡αr2κrd(r2κr)−1+2m−γtγ

(9) ≡αr2κr(d(r2κr)−1mod 2m−γ)+2m−γtγ′ (mod 2m+ℓ)

as t_γ, or equivalently tγ′ , runs through all integers on 0 ≤ t_γ, tγ′<2ℓ+κr .

To go from (7) to (8), first note that B runs through all values in [2^m, 2^m+ℓ). If γ = 0, term A introduces multiplicity by repeating the sequence generated by B with various offsets. These offsets are of no significance to this analysis, as we only account for which values occur in the set and with what multiplicity.

If γ > 0, term A runs through all values in [2^m−γ, 2^m−γ+κ_r). As κ_r ≥ γ when γ > 0, term A runs through all values in the subrange [2^m−γ, 2^m). When A assumes values greater than or equal to 2^m, it introduces multiplicity by repeating the sequence of all values on [2^m−γ, 2^m+ℓ) generated by A and B with various offsets.

This implies that (A + B) mod 2^m+ℓ runs through all 2^m+ℓ/2^m−γ = 2^ℓ+γ values on [2^m−γ, 2^m+ℓ) with multiplicity 2^ℓ+κ_r/2^ℓ+γ = 2^κ_r−γ, and this is exactly what is stated in (8). To go from (8) to (9) is trivial.

As there are 2^m+2ℓ admissible argument pairs, and as each pair occurs with multiplicity 2^κ_r−γ, there are 2^{m+2ℓ−κ_r+γ} distinct admissible argument pairs.

The lattice L^α is constructed from (9), as the admissible arguments α_r are multiples of 2^κ_r, and as the admissible αd≡(αr / 2κr) δr+2m+γtγ′ (mod 2m+ℓ) , in the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1, and so the lemma follows.

In Figure 2 the distribution of admissible argument pairs (α_d, α_r) in the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1 is depicted for various combinations of d and r.

Figure 2

The distribution of admissible argument pairs (α_d, α_r) in the region where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1 for m = 4 and ℓ = 3, and example combinations of d and r, as indicated. The lattice may be constructed by replicating the fundamental parallelogram (blue) or a rectangle (gray) of size 2^m−γ × 2^m+κ_r−γ.

4.1 Pairs (j, k) yielding (α_d, α_r)

In this section we identify all pairs (j, k) that yield (α_d, α_r).

Lemma 4.5

The set of integer pairs (j, k), for j on 0 ≤ j < 2^m+ℓ and k on 0 ≤ k < 2^ℓ, that yield the admissible argument pair (α_d, α_r) is given by

j=(αr2κr(r2κr)−1+2m+ℓ−κr tr) mod 2m+ℓ and k=αd−dj2m mod 2ℓ

as t_r runs through all integer multiples of 2^γ on 0 ≤ t_r < 2^κ_r.

Proof

As α_r ≡ rj (mod 2^m+ℓ), solving for j yields

j=(αr2κr(r2rκ)−1+2m+ℓ−κr tr) mod 2m+ℓ

for t_r an integer 0 ≤ t_r < 2^κ_r.

As α_d ≡ dj + 2^mk (mod 2^m+ℓ), for compatibility 2^m must divide 2^m+ℓ−κ_r dt_r for all t_r ≠ 0. As 2^{m+ℓ+κ_d−κ_r} is the greatest power of two to divide 2^m+ℓ−κ_r d, it follows that t_r must be a multiple of 2^γ, and so the lemma follows.

4.2 The density of pairs (α_d, α_r)

In this section we analyze the density of admissible argument pairs (α_d, α_r) in the argument plane.

Claim 4.6

The density of admissible argument pairs (α_d, α_r) in the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1 is 2^−m when accounting for multiplicity.

Proof

There are 2^m+2ℓ admissible argument pairs (α_d, α_r), when accounting for multiplicity, in the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1. This region is of area 2^2(m+ℓ). The density is hence 2^m+2ℓ/2^2(m+ℓ) = 2^−m, and so the claim follows.

To construct the histogram for the probability distribution, the argument plane is divided into small rectangular subregions. Lemma 4.7 below bounds the error when approximating the density in such subregions by 2^−m.

Lemma 4.7

Let D be the density of admissible argument pairs (α_d, α_r), when accounting for multiplicity, in a rectangle R of area A and circumference C in the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1. Then

|D−12m|≤2κr−γ 2Cλ2+4 (2λ2)2A detLα=2Cλ2+4 (2λ2)22mA

for λ₁ the norm of the shortest non-zero vector w₁ ∈ L^α, and λ₂ the norm of the shortest non-zero vector w₂ ∈ L^α that is linearly independent to w₁.

Proof

By Lemma 4.4, the admissible argument pairs (α_d, α_r) are vectors in L^α in the region of the argument plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1. Each admissible argument pair occurs with multiplicity 2^κ_r−γ.

The fundamental parallelogram in L^α contains a single lattice vector. It is spanned by w₁ and w₂, and has area det L^α = λ₂ |w_⊥| = 2^m+κ_r−γ, where w_⊥ is the component in w₁ perpendicular to w₂. This implies λ₂ ≥ λ₁ ≥ |w_⊥|. To bound the number of argument pairs (α_d, α_r) ∈ R, we lower- and upper-bound the number of fundamental parallelograms that can at most fit into R, as described below, paying particular attention to the border areas:

To upper-bound the number of vectors in R, we extend each side of R by 2 λ₂ length units, to ensure that any parallelogram that is only partly in R is included in the count, and divide the area of the resulting rectangle by the area of the fundamental parallelogram. This yields (A + 2Cλ₂ + 4 (2 λ₂)²) / det L^α.

Conversely, to lower-bound the number of vectors in R, we retract each side of R by 2λ₂ length units, to ensure that all parallelograms that are only partly in the rectangle are excluded from the count, and divide the area of the resulting rectangle by det L^α. This yields (A − 2Cλ₂ + 4 (2λ₂)²) / det L^α.

By combining the upper and lower bounds, dividing by the area A of R, and multiplying by 2^κ_r−γ to account for multiplicity, the lemma follows.

For known d and r, Lemma 4.7 above provides a bound on the error when approximating the density in a rectangle in L^α by 2^−m as λ₂ may then be computed. To bound the error for general problem instances, and when d and r are unknown, we introduce the following less tight lemma:

Lemma 4.8

Let D be the density of admissible argument pairs (α_d, α_r), when accounting for multiplicity, in a rectangle of side lengths l_d and l_r in the α_d and α_r directions, respectively, in the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1. Then

|D−12m|≤2κr2mlr+12γld+1ldlr.

Proof

By Lemma 4.4, the admissible argument pairs (α_d, α_r) are vectors in L^α in the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1. Each admissible argument pair occurs with multiplicity 2^κ_r−γ.

The vectors in L^α are on horizontal lines (for fixed α_r) evenly separated by a vertical distance of 2^κ_r. The number of such lines that intersect the rectangle is upper-bounded by ⌊l_r/2^κ_r⌋ + 1 ≤ l_r/2^κ_r + 1 and lower-bounded by ⌊l_r/2^κ_r⌋ ≥ l_r/2^κ_r − 1 as may be seen by positioning the rectangle to maximize or minimize the number of lines that intersect the rectangle.

On each line, the vectors in L^α are evenly spaced by a distance of 2^m−γ with varying starting positions. The number of vectors in L^α that fall within the rectangle on each line is upper-bounded by ⌊l_d/2^m−γ⌋ + 1 ≤ l_d/2^m−γ + 1 and lower-bounded by ⌊l_d/2^m−γ⌋ ≥ l_d/2^m−γ − 1, when not accounting for multiplicity, as may be seen by positioning the line to maximize or minimize the number of vectors that fall within the rectangle.

Hence the number of lattice vectors in the rectangle is upper-bounded by

2κr−γ(lr/2κr+1)(ld/2m−γ+1)=ldlr/2m+ld2κr/2m+lr/2γ+1

and lower-bounded by

2κr−γ(lr/2κr−1)(ld/2m−γ−1)=ldlr/2m−ld2κr/2m−lr/2γ+1

as each vector corresponds to a pair that occurs with multiplicity 2^κ_r−γ. By combining these bounds, and dividing by the rectangle area l_dl_r, the lemma follows.

For unknown d and r, Lemma 4.8 above provides an error bound, assuming only some bounds on the parameters κ_r and γ. Asymptotically, the error in the approximation tends to zero as the side lengths of the rectangle tend to infinity. For rectangular subregions of specific dimensions, it may furthermore be shown that the error is zero, as is demonstrated in the following lemma:

Lemma 4.9

The density of admissible argument pairs (α_d, α_r) in a rectangle of side lengths positive integer multiples of 2^m−γ and 2^m−γ+κ_r in α_d and α_r, respectively, in the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1, is 2^−m when accounting for multiplicity.

Proof

From the definition of L^α in Lemma 4.4, it follows that the lattice is cyclic with period 2^m−γ in α_d and 2^m−γ+κ_r in α_r. This is illustrated in Figure 2 where rectangular regions of these dimensions are highlighted in gray. The highlighted regions all extend from the origin in Figure 2 but the starting point may of course be arbitrarily selected. This implies that the lattice L^α may be generated by replicating and translating any rectangle of side lengths positive multiples of 2^m−γ and 2^m−γ+κ_r in α_d and α_r, respectively, see Figure 2, throughout the plane. The same holds if the rectangle is replicated and translated cyclically throughout the region of the plane where −2^m+ℓ−1 ≤ α_d, α_r < 2^m+ℓ−1.

The number of rectangles that fit in the region when replicated and translated cyclically is

22(m+ℓ)/22(m−γ)+κr=22(ℓ+γ)−κr

as the area of the region is 2^2(m+ℓ) and the area of the rectangle is 2^{2(m−γ)+κ_r}. The total number of lattice vectors in the region is 2^2m+ℓ, so each rectangle contains 2^m+2ℓ/2^{2(ℓ+γ)−κ_r} = 2^m−2γ+κ_r vectors when accounting for multiplicity. By dividing by the area of the rectangle, we see that the density of points in each rectangle is 2^m−2γ+κ_r/2^{2(m−γ)+κ_r} = 2^−m, and so the lemma follows.

5 Simulating the quantum algorithm

In close analogy with [7], we now proceed to construct a high-resolution histogram for the probability distribution induced by the quantum algorithm, for given d and r, and to sample it to simulate the quantum algorithm.

5.1 Constructing the histogram

Except for the fact that the probability distribution is two-dimensional, and that we need to account for the closed-form expression being an approximation, we exactly follow [7] to construct the high-resolution histogram: We subdivide the argument plane into regions and subregions, and integrate the closed-form probability approximation P˜(θd,θr) and associated error bound e˜(θd,θr) numerically in each subregion.

First, we subdivide each quadrant of the argument plane into (30 + μ)² rectangular regions where μ = min(ℓ − 2, 11). Each region thus formed is uniquely identified by (η_d, η_r) ∈ ℤ² by requiring that for all (α_d, α_r) in the region

2|ηd|≤|αd|≤2|ηd|+1 and 2|ηr|≤|αr|≤2|ηr|+1,

and furthermore that sgn(α_d) = sgn(η_d) and sgn(α_r) = sgn(η_r), where η_d and η_r are such that

m−30≤|ηd|,|ηr|≤m+μ−1,

see the illustration in Figure 3.

Figure 3

The subdivision of the plane into regions and subregions. The gray box illustrates Simpson's rule applied to a subregion. The probability is computed in the blue corner points, the four red border mid-points and the red center-point.

Then, we subdivide each region into rectangular subregions identified by an integer pair (ξ_d, ξ_r) by requiring that for all (α_d, α_r) in the subregion

2|ηd|+ξd/2ν≤|αd|≤2|ηd|+(ξd+1)/2ν and 2|ηr|+ξr/2ν≤|αr|<2|ηr|+(ξr+1)/2ν

where 0 ≤ ξ_d, ξ_r < 2^ν for ν ∈ {6, 7, 8, 9} a resolution parameter adaptively selected as a function of the probability mass and variance in each region.

For each subregion, we compute the approximate probability mass contained within the subregion, and an associated error bound, by applying Simpson's rule in two dimensions, followed by Richardson extrapolation to cancel the linear error term, and division by 2^m to account for the density of pairs.

Simpson's rule is hence applied 2^2ν (1 + 2²) times in each region. Each application requires P˜(θd,θr) and e˜(θd,θr) to be evaluated in up to nine points, for which purpose we use the closed-form expressions in Theorem 3.22, with σ adaptively selected to suppress e˜(θd,θr) .

The optimal σ may be found by searching exhaustively. A computationally more efficient method for selecting σ is to use the heuristic in Appendix C.5.3. We use the heuristic in all cases except when s is large in relation to m causing the error in the closed-form approximation to be large. For such m and s we accept an extra computational burden to get slightly better σ and slightly smaller errors.

In what follows, we refer to the total probability mass captured as the sum of the integral of P˜(θd,θr) over all subregions, and to the total approximation error as the sum of the integral of e˜(θd,θr) over all subregions. Note that the total approximation error is an upper bound, by this definition, that is by no means tight. The actual error in the approximation is likely smaller than the bound indicates.

In order to save space when storing the histogram, we discard regions that capture insignificant shares of the probability mass. Note furthermore that for m and s such that the total approximation error is large, the error may often be reduced at the expense of capturing a smaller fraction of the probability mass by simply discarding selected regions where the error is large. The errors we report in this paper are without accounting for such additional filtering.

Note that this method of constructing the histogram assumes κ_d and κ_r to be small in relation to m. Note also that it follows from Section 4.2 that it is sound to approximate the density by 2^−m in the four regions of interest in the plane. For the m and s that we consider, the error in the density approximation is negligible.

5.2 Understanding the probability distribution

To illustrate the distribution that arises, a histogram is plotted in the signed logarithmic argument plane in Figure 4 for m = 2048 and s = 30, and for d and r selected as explained in Section 7.3. It captures approximately 99.99% of the probability mass. The total approximation error is less than 10⁻³.

Figure 4

The probability distribution for general discrete logarithms computed as in Section 5.1 for m = 2048, s = 30, and d and r selected as in Section 7.3. To facilitate printing, the resolution has been reduced in this figure.

The histogram plotted in Figure 4 captures the characteristic appearance of the probability distribution when d and r are both of length m bits and not divisible by large powers of two.

The probability mass is located in the regions where (|α_d|, |α_r|) ~ (2^m, 2^m), whereas for random pairs (j, k) the arguments would both be of size ~ 2^m+ℓ. This implies that a single run of the quantum algorithm yields ~ ℓ ~ m/s bits of information on d and r, respectively.

The distribution is symmetric, in that the top right and lower left quadrants are mirrored, as are the top left and lower right quadrants. It hence suffices to compute only two quadrants to construct the histogram. To see why this is, note that flipping the sign of both arguments in the expression for P˜(θd,θr) in Theorem 3.22 has no effect. Flipping the sign of only one argument, on the other hand, may lead to cancellation or lack of cancellation in the angle θ_d2^σ + θ_r ⌈−2^σd/r⌉. This explains the concentration of probability mass in the top right and lower left quadrants, and in the tail that extends along the diagonal in Figure 4 where θ_d2^σ + θ_r ⌈−2^σd/r⌉ is small.

If the size of d in relation to r is gradually reduced, the tail along the diagonal gradually disappears, as the cancellation effect weakens when d/r is reduced in size. When d is approximately of length m − 5 bits, the tail is nearly gone, and all four quadrants are nearly mirrored. Further reducing d has no significant effect on the distribution. Varying r on the interval 2^m−1 < r < 2^m only slightly affects the distribution. Scaling m and s has virtually no effect on the distribution, for as long as m/s does not become so small so as to cause constructive interference not to arise, or the plot to be cropped.

The marginal distribution along the α_r axis in Figure 4 is virtually identical to the distribution induced by r when performing order finding, see Appendix A and Figure A1 for comparison. In Appendix D we prove this correspondence analytically by summing P˜(θd,θr) over all admissible angles θ_d with multiplicity. Analogously, the marginal distribution along the α_d axis is seemingly virtually identical to the probability distribution induced by d when regarded as a short discrete logarithm, see [7] and Figure 5 for comparison. We have not as of yet been able to prove this correspondence analytically but it may be evidenced numerically by comparing the distributions for specific problem instances.

Figure 5

The probability distribution for short discrete logarithms computed as in Appendix B from the closed-form expression in [7], for m = 2048 and s = 30, and d selected as in Section 7.3. The resolution has been reduced in this figure.

The above observations imply that the lattice-based post-processing algorithm introduced in [7] may be used to solve sets of pairs (j, k) for both short and general d, with minor modifications, see Section 6.1. An analogous lattice-based algorithm may be used to solve sets of integers j for r, see Section 6.2. The hardest instances to solve are those where d is large in relation to r, and r is large in relation to 2^m, as in Figure 4, due to the tail that then extends along the diagonal.

5.3 Sampling the probability distribution

Except for the fact that the probability distribution is two-dimensional, we exactly follow [7] to sample the distribution: To sample an argument pair (α_d, α_r), we first sample a subregion and then sample (α_d, α_r) from this subregion.

To sample the subregion, we first order all subregions in the histogram by probability, and compute the cumulative probability up to and including each subregion in the resulting ordered sequence. Then, we sample a pivot uniformly at random from [0, 1), and return the first subregion in the ordered sequence for which the cumulative probability is greater than or equal to the pivot. Note that this procedure may fail: This occurs if the pivot is greater than the total cumulative probability.

To sample an argument pair (α_d, α_r) from the subregion, we first sample a point (αd′,αr′)∈ℤ2 uniformly at random from the subregion. Then, we map (αd′,αr′) to the closest admissible argument pair (α_d, α_r) ∈ L^α by reducing the basis for L^α given in Definition 4.3 and applying Babai's algorithm [1].

To sample an integer pair (j, k) from the distribution, we first sample (α_d, α_r) as described above, and then sample (j, k) uniformly at random from the set of all integer pairs (j, k) yielding (α_d, α_r) using Lemma 4.5. More specifically, we first sample an integer t_r uniformly at random from the set of all admissible values for t_r and then compute (j, k) from (α_d, α_r) and t_r as described in Lemma 4.5.

6 The classical post-processing algorithms

In this section, we describe how d and r are classically recovered from a set {(j₁, k₁), . . ., (j_n, k_n)} of pairs produced by performing n independent runs of the quantum algorithm.

6.1 Recovering d from a set of n pairs

To recover d, we exactly follow [7], and use the set of n pairs to form a vector

vdk=({−2mk1}2m+ℓ, …, {−2mkn}2m+ℓ, 0)∈ℤD

and a D-dimensional integer lattice L^j with basis matrix

[j1j2⋯jn12m+ℓ0⋯0002m+ℓ⋯00⋮⋮⋱⋮⋮00⋯2m+ℓ0]

where D = n + 1. For some constants m₁, . . ., m_n ∈ ℤ, the vector

udj=({dj1}2m+ℓ+m12m+ℓ, …, {djn}2m+ℓ+mn2m+ℓ, d)∈Lj

is such that the distance

Rd=|udj−vdk|=∑i=1n({dji}2m+ℓ+mi2m+ℓ−{−2mki}2m+ℓ)2+d2=∑i=1n{dji+2mki}2m+ℓ2︸αd,i2+d2=∑i=1nαd,i2+d2.

To recover d, it hence suffices to find udj by enumerating all vectors in L^j within a D-dimensional hypersphere of radius R_d centered on vdk . Its volume is

VD(Rd)=πD/2Γ(D2+1)RdD

where Γ is the Gamma function, whilst the fundamental parallelepiped in L^j, that by definition contains a single lattice vector, is of volume det L^j = 2^(m+ℓ)n.

Heuristically, the hypersphere is hence expected to contain approximately v_d = V_D(R_d)/det L^j lattice vectors. The exact number depends on the placement of the hypersphere in ℤ^D, and on the shape of the fundamental parallelepiped in L^j.

6.1.1 Estimating the minimum n required to solve for d

The radius R_d depends on (j_i, k_i) via α_d,i for 1 ≤ i ≤ n. For fixed n and probability q_d, we exactly follow [7] and estimate the minimum radius R˜d such that

(10) Pr[Rd:=∑i=1nαd,i2+d2≤R˜d]≥qd

by sampling α_d,i from the probability distribution. For details on how the estimate is computed, see Section 6.3. It follows from (10) that

(11) Pr[vd:=VD(Rd)detLj≤VD(R˜d)2(m+ℓ)n]≥qd.

This provides a heuristic bound on the number of lattice vectors v_d that at most have to be enumerated to solve for d, and that holds with probability at least q_d.

6.1.2 Selecting n and solving for d

A simple strategy when solving for d is to select n as described in Section 6.1.1 such that v_d is below a bound equal to the maximum number of vectors that it is computationally feasible to enumerate with probability q_d. This strategy minimizes n at the expense of performing potentially computationally expensive post-processing.

Another strategy is to select n such that v_d < 2 with probability q_d, so that there is only one vector in the hypersphere by the heuristic. In theory, this enables us to find udj with probability q_d by mapping vdk to the closest vector in L^j without enumerating vectors in L^j. In practice, however, the situation is a bit more complicated as urj=({rj1}2m+ℓ, …, {rjn}2m+ℓ, r)∈Lj and this vector is short in L^j by construction. To further complicate matters, urj/z may be in L^j when r is composite, for z some factor of r, see Section 6.2.3. To recover udj , we therefore first map vdk to the closest vector in L^j, and then add or subtract small integer multiples of the shortest vector in the reduced basis for L^j to find udj . This is efficient, except if r has very many small prime factors, and n is close to one, in which case an additional classical post-processing step may be required, see Section 6.2.5.

Note that this complication arises only for general discrete logarithms. It does not arise in [7] when post-processing short discrete logarithms, as the order then does not enter into the equation. Note furthermore that the fact that the order now does play a part may be leveraged in the post-processing, see the next sections.

6.1.3 Selecting n and solving for d by exhausting subsets

The greatest argument α_d,i essentially determines the bound on R_d and hence on v_d. A plausible strategy is therefore to make n runs, but to independently post-process all subsets of n − t pairs from the resulting n pairs, for t a constant.

To select n when using this strategy, we specify a bound B on the number of vectors v_d that we accept to enumerate in each lattice of dimension n − t + 1, and follow Section 6.1.1 to select the minimum n respecting this bound with probability at least q_d, including only the smallest n − t arguments α_d,i when bounding R_d.

With probability q_d, the post-processing then heuristically requires at most B lattice vectors to be enumerated in at most (nt) lattices of dimension n − t + 1. Note that t must be small as the binomial coefficient grows rapidly in t.

6.1.4 Optimizations when r is known

Note that when r is known, the argument α_r,i = {rj_i}_2^m+ℓ is known for 1 ≤ i ≤ n, and α_r,i provides information on α_d,i as the arguments are pairwise correlated. When constructing subsets of n − t pairs from the n pairs (j_i, k_i), the pairs should be included in ascending order sorted by |α_r,i|. In general, pairs such that |α_r,i| exceeds some bound may be rejected as large |α_r,i| identify erroneous runs.

6.2 Recovering r from a set of n pairs

To recover r, we instead use that urj=({rj1}2m+ℓ, …, {rjn}2m+ℓ, r)∈Lj is a short vector by construction. More specifically, we use that urj is within a D-dimensional hypersphere in L^j of radius

Rr=|urj|=∑i=1n{rji}2m+ℓ2︸αr,i2+r2=∑i=1nαr,i2+r2

centered at the origin. In close analogy with [7] and the previous section, we may recover urj and hence r by enumerating all vectors in this hypersphere. Heuristically, we expect the hypersphere to contain v_r = V_D(R_r) / det L^j lattice vectors.

This generalization was hinted at in the pre-print of [8]. Furthermore, it is similar to the method employed by Seifert in [24], where he uses what he refers to as simultaneous Diophantine approximation techniques to generalize Shor's [25] continued fractions expansion-based post-processing to higher dimensions. In the case of Shor's original order-finding algorithm, the fact that the problem of finding a continued fraction may be perceived as a lattice problem is observed in [14].

We prefer to describe the post-processing in terms of a shortest vector problem, as this gives us two lattice problems in the same lattice L^j, and as we may re-use the tools previously introduced to estimate the number of runs n required to solve the problem.

6.2.1 Estimating the minimum n required to solve for r

The radius R_r depends on j_i via α_r,i for 1 ≤ i ≤ n. For fixed n and probability q_r, we proceed in analogy with [7] and estimate the minimum radius R˜r such that

(12) Pr[Rr:=∑i=1nαr,i2+r2≤R˜r]≥qr

by sampling α_r,i from the probability distribution. For details on how the estimate is computed, see Section 6.3. It follows from (12) that

(13) Pr[vr:=VD(Rr)detLj≤VD(R˜r)2(m+ℓ)n]≥qr.

This provides a heuristic bound on the number of lattice vectors v_r that at most have to enumerated to solve for r, and that holds with probability at least q_r.

6.2.2 Selecting n and solving for r

One strategy when solving for r is to use the heuristic to select n such that v_r is below a bound equal to the maximum number of vectors that it is computationally feasible to enumerate, with probability q_r. This strategy minimizes n at the expense of performing potentially computationally expensive post-processing.

Another strategy is to select n such that v_r < 2 with probability q_r, so that there is only one lattice vector in the hypersphere by the heuristic. In theory, this enables us to find urj with probability q_r by computing the shortest non-zero vector in L^j.

In practice, the heuristic is good when r is prime, as is typically the case when computing discrete logarithms in cryptographic settings. If r is composite, the heuristic is still good, but it may be necessary to perform a small search to find r if r has one or more small prime factors, see Section 6.2.3. If r has many small prime factors, and n is close to one, an additional classical post-processing step may be required to solve efficiently for r, as there may then exist an artificially short non-zero vector in L^j. This additional step is described in Section 6.2.4.

A third strategy is to independently post-process subsets of the pairs output by the quantum computer, in analogy with the procedure described in Section 6.1.3.

6.2.3 Handling composite r

Assume that r is composite. Let gcd(α_r,₁, . . ., α_r,n, r) = 2^κ_r o for o odd. Let t be the greatest integer on [0, κ_r] for which

αr,i/(2to)={rji}2m+ℓ/(2to)={rji/(2to)}2m+ℓ

for all i ∈ [1, n]. Then |urj|/(2to)∈Lj and |urj|/(2to)≤|urj| , so urj/(2to) and r/(2^to) will likely be recovered in the post-processing instead of urj and r.

For q an odd prime divisor of r, the probability of q also dividing α_r,i for all i ∈ [1, n] is approximately q⁻ⁿ. This implies that r may in general be recovered from r/(2^to) by exhausting t and o, as the search space is expected to be small: It is only if r has very many small odd prime divisors, and if n is close to one, that problems may potentially arise. Such problematic cases may be handled efficiently by introducing an additional classical post-processing step, see the next section.

6.2.4 Handling partially smooth r

Let 𝒫 be the set of all prime factors ≤ cm, for c ≥ 1 some small constant, and let υ_q be the greatest integer such that q^υ_q < 2^m. Furthermore, for r′ = r/(2^to) let

g˜=[∏q∈𝒫qυq] g and g˜f=[r′∏q∈𝒫 \ {f}qυq] g for f∈𝒫,

where bracket notation is used to denote generalized exponentiations. Computing g˜ requires at most 2m · #𝒫 ≤ 2cm² group operations^[1] to be evaluated classically, for #𝒫 the cardinality of 𝒫. It may hence be done efficiently.

As previously explained, when r is partially very smooth, the classical post-processing algorithm is likely to return r′ = r/(2^to), where o may be large, but where all prime factors of o are small. Assume that all prime factors of 2^to are ≤ cm. It must then be that [r′] g˜≡1 , enabling us to quickly test if r′ is on said form. Once r′ = r/(2^to) is found, it is easy to find r: For all f ∈ 𝒫, compute g˜f and find the smallest non-negative integer e_f such that [fef] g˜f≡1 . Then

2to=∏f∈𝒫fef

allowing r = 2^to · r′ to be recovered. Computing g˜f requires at most 2cm² group operations for each f ∈ 𝒫, for a total of at most 2cm² · #𝒫 ≤ 2c²m³ group operations. The recovery procedure is hence efficient. Note that the procedure may be optimized in various ways. The above description conveys the basic idea.

6.2.5 Computing discrete logarithms when r is partially smooth

If r is partially very smooth, it may be hard to determine d, as there may exist an artificially short vector |urj|/2to∈Lj , where o is smooth. Note however that it is still possible to determine d mod r′, by reducing the last component of the vector udj sought for in the classical post-processing algorithm modulo r′ = r/2^to.

Provided we classically solve the discrete logarithm problem in the residual subgroups of small prime power orders f^e_f dividing 2^to, which can be done efficiently, the full logarithm d may then be found via the Chinese remainder theorem. This was originally observed by Pohlig and Hellman [20].

6.3 Estimating R˜d and R˜r

To estimate R˜d and R˜r for m, s and n, known d and r, and a given target success probability q_d or q_r, we exactly follow [7] and sample N sets of n argument pairs {(α_d,₁, α_r,₁), . . . , (α_d,n, α_r,n)} from the probability distribution. For each set, we compute R_d, sort the resulting list of values in increasing order, and select the value at index ⌊(N − 1) q_d⌉ to arrive at our estimate for R˜d . The estimate of R˜r is then computed analogously. The constant N controls the accuracy. If N to be sufficiently large in relation to q_d and q_r, and to the variance in the arguments, we expect this approach to yield sufficiently good estimates.

If we fail to sample one or more argument pairs in a set, we closely follow [7] and over-estimate R˜d and R˜r by letting R_d = R_r = ∞ for the set. The entries for the failed sets will then all be sorted to the end of the lists. If the value of R˜d or R˜r selected from the sorted lists is ∞, no estimate is produced.

Let p be the total probability mass covered by the histogram. The probability of all n pairs in a set being in regions covered by the histogram is then pⁿ. When sampling N sets, the expected number of sets with finite R_d and R_r is Npⁿ. As Nq_d and Nq_r entries, respectively, in the two lists must be finite for the algorithm to produce an estimate, it follows that it is required that q_d, q_r > pⁿ, with some margin to account for the sampling variance, for estimates to be produced.

7 Estimating the number of runs required

We now have the necessary framework in place to compute concrete estimates for the number of runs n required to attain a given minimum success probability q when recovering both d and r for tradeoff factor s for specific problem instances.

In this section, we describe and exemplify the procedure by computing estimates for a a set of concrete problem instances selected in a randomized manner. We furthermore consider how to conjecturally obtain worst case estimates of n.

7.1 Estimating n

To estimate n for a problem instance given by d and r, and for tradeoff factor s, we proceed as follows:

For n = s + 1, s + 2, . . . we first estimate R˜d and R˜r by sampling N = 10⁶ sets of n argument pairs (α_d, α_r), as explained in Section 6.3. We stop and record the smallest n for which the volume quotients v_d < 2 and v_r < 2 with probability q_d and q_r, respectively, where q_d, q_r ≥ q ≥ 99%. As the volume quotients each decrease by approximately a constant factor for every increment in n, the minimum n may in practice be found efficiently by interpolation once a few quotients have been computed.

For selected problem instances, we verify the above initial estimate of n by simulating the quantum algorithm and post-processing the simulated output. More specifically, with the initial estimate of n as our starting point, we sample M = 10³ sets of n pairs (j, k), as explained in Section 5.3, and test whether recovery of both d and r is successful for at least ⌈Mq⌉ sets when executing the post-processing algorithms in Sections 6.1 and 6.2 without enumerating L^j. Depending on the outcome of the test, we either increment or decrement n, and repeat the process, recursively, until the smallest n such that the test passes has been identified. We record this n alongside the initial estimate of n.

In practice, we compute the closest vector in L^j by reducing the lattice basis and applying Babai's [1] nearest plane algorithm. The shortest non-zero vector in L^j is the shortest non-zero vector in the reduced basis. Enumeration is performed using Kannan's [12] original approach, as this is sufficient for our purposes. Note however that there are more efficient approaches in the literature.

To reduce the basis, we closely follow [7] and employ LLL and BKZ [15, 16, 22, 23], as implemented in fpLLL v5.0, with default parameters and a block size of min(n + 1, 10) for all combinations of m, s and n. We first compute a LLL reduction. If it yields no solution, we proceed to compute a BKZ reduction.

7.2 Selecting m and s

Since the cost of estimating n for a given problem instance is non-negligible, we seek to minimize the number of problem instances considered, whilst selecting problem instances that are representative of those that underpin the currently most widely deployed asymmetric cryptosystems.

To this end, for m ∈ {128, 256, 384, 512, 1024, . . . , 8192}, we pick a single combination of d and r using the method described in Section 7.3, and estimate n for a subset of tradeoff factors s ∈ {1, 2, . . . , 8, 10, 20, . . . , 50, 80}, such that the total approximation error is negligible.

In terms of group size, the above choices of m capture most currently widely deployed elliptic curve groups, Schnorr groups and safe-prime groups.

7.3 Selecting d and r given m

For each value of m, we select d and r such that 2^m−1 < d < r < 2^m in a randomized manner.

For as long as d and r do not have very special properties, such as being divisible by large powers of two or being otherwise smooth, the exact values of d and r are of no great significance, however: It is the size of d in relation to r, and the size of r in relation to 2^m, that primarily affects the appearance of the distribution and hence the estimates. To avoid having to tabulate d and r for the m we consider, we read d and r from the decimal expansion of Catalan's constant

G=∑i=0∞(−1)i(2i+1)2=112−132+152−172+⋯.

Specifically, we let cm,k=∑j=0m−22m−2−jg8191k+j for g_i the i^th bit in the decimal expansion of G, and select r = 2^m−1 + c_m,0 and d = 2^m−1 + (c_m,1 mod c_m,0).

7.4 Experiments and results

The estimates of n in Table 1 were produced by executing the procedure described in the previous sections. As may be seen in the table, n asymptotically tends to s + 1 as m tends to infinity for fixed s. For fixed m, it holds that n = s + 1 up to some cutoff point in s.

Table 1

The estimated number of runs n required to solve for both a general discrete logarithm d and group order r, selected as described in Section 7.3, with ≥ 99% success probability and without enumerating the lattice. For details, see Section 7.4. For A the initial and B the simulated estimate, we print B / A, unless B = A; we then only print A. Dash indicates no estimate. For ɛ the total approximation error, an asterisk indicates that 10⁻⁴ ≤ ɛ < 10⁻³. For all other estimates ɛ < 10⁻⁴.

		group and logarithm size m
		128	256	384	512	1024	2048	4096	8192

tradeoff factor s	1	2	2	2	2	2	2	2	2
	2	*3	3	3	3	3	3	3	3
	3	–	4	4	4	4	4	4	4
	4	–	* 5	5	5	5	5	5	5
	5	–	–	6	6	6	6	6	6
	6	–	–	* 7	7	7	7	7	7
	7	–	–	–	8	8	8	8	8
	8	–	–	–	* 10	9	9	9	9
	10	–	–	–	–	11	11	11	11
	20	–	–	–	–	–	22	21	21
	30	–	–	–	–	–	* 35	33 / 32	31
	40	–	–	–	–	–	–	44	42
	50	–	–	–	–	–	–	57	54 / 53
	80	–	–	–	–	–	–	–	– / 88

The estimates are for not enumerating the lattice L^j. By enumerating a bounded number of vectors in the lattice, n may potentially be further reduced. In particular, our experiments show that a single run suffices to solve with probability q ≥ 99% for s = 1, provided we accept to enumerate up to ~ 1.3 · 10³ vectors.

As may furthermore be seen in the table, the initial estimates of n are in general verified by the simulations. In general v_d > v_r. Hence v_d determines the initial estimate for n. Note however that when the heuristic estimate of v_d is close to two, minor discrepancies between the initial estimates and the simulations may arise.

This phenomenon is discussed in [7]: For large tradeoff factors s in relation to m, increasing or decreasing n typically has a small effect on v_d and v_r. This may lead to slight instabilities in the estimates, as v_d may be close to two for several values of n. Discrepancies may also arise, especially for large n, if we fail to find the closest and shortest non-zero vectors in L^j, or if sampling fails. Such discrepancies may be amplified by the difference in the sample sizes N and M.

7.5 Generalizing the results to compute worst case estimates

Recall from Section 5.2 that the larger d is permitted to grow in relation to r, and the larger r is permitted to grow in relation to 2^m, the harder it in general becomes to solve for d and r, assuming neither d nor r to be divisible by large powers of two. This may be seen by computing estimates for various combinations of d and r, or simply by examining the appearance of the two-dimensional distribution for various d and r.

Furthermore, recall from Section 5.2 that the marginal distributions along the α_d and α_r axes in the two-dimensional distribution for general discrete logarithms may be seen to correspond to the distributions induced by the quantum algorithms for computing short discrete logarithms with tradeoffs, and orders with tradeoffs, respectively. Both correspondences may be observed numerically by comparing distributions computed for specific problem instances. The correspondence in α_r may furthermore be shown analytically, see Appendix D.

As v_d > v_r in general when d is large in relation to r, and neither d nor r are divisible by large powers of two, we therefore in general expect estimates of n for computing the general discrete logarithm d in a group of m bit order r to agree with estimates of n for computing the short discrete logarithm d, when m is taken as the upper bound on the bit length of d. For d and r selected as in Section 7.3, this correspondence may be observed in practice by comparing the estimates in Table 1 to those in Table B1 in Appendix B.

The above implies that we may conjecturally claim to obtain worst case estimates of n for general discrete logarithms by computing estimates of n for maximal short discrete logarithms d = 2^m − 1 when m is taken as an upper bound on the bit length of d. Such estimates are provided in Table B2 in Appendix B.

Note that when comparing estimates of n for general discrete logarithms to those for short discrete logarithms and orders, the comparison must of course be restricted to combinations of m and s such that the total approximation error is negligible. It is reasonable to presume that the correspondence between the distributions would continue to hold even if s was to be permitted to grow a bit past the point where the total approximation error becomes non-negligible for a given m, since the error bound is by no means tight. However, we can not demonstrate the correspondence for such m and s.

8 Order finding with tradeoffs

The algorithm for computing general discrete logarithms in this paper does not require the group order to be known, as neither the quantum algorithm nor the classical post-processing algorithm makes explicit use of the order. If the order of the group is unknown, it may be computed from the same set of pairs (j, k) output by the quantum computer as is used to compute the logarithm.

This implies that the algorithm may be used as an order-finding algorithm. When only the order is of interest, only j need to be computed, as k is not used by the post-processing algorithm that recovers the order. The second stage of the quantum algorithm where k is computed need therefore not be executed when the goal is to perform order finding. If the second stage is removed, the quantum algorithm reduces to the algorithm proposed by Seifert [24]. For s = 1 this algorithm in turn reduces to Shor's order-finding algorithm.

This provides a link between our works on computing discrete logarithms, Seifert's work on order finding, and Shor's original work. As for post-processing, Seifert generalizes Shor's continued fractions-based post-processing algorithm to higher dimensions. We instead use lattice-based post-processing.

In Appendix A, we provide a description of Shor's and Seifert's quantum algorithms for order finding, a complete analysis of the probability distributions that they induce, and estimates for the number of runs n required to solve various problem instances for r when using lattice-based post-processing.

9 Summary and conclusion

Unlike Shor's algorithm, our algorithm does not require the group order to be known. It simultaneously computes both the order and the logarithm. This allows it to outperform Shor's original algorithms with respect to the overall number of group operations that need to be evaluated quantumly in some cases even when not making tradeoffs. One cryptographically relevant example of such a case is the computation of discrete logarithms in Schnorr groups of unknown order.

We analyze the probability distributions induced by our algorithm, and by Shor's and Seifert's order-finding algorithms, describe how all of these algorithms may be simulated when the solution is known, and estimate the number of runs required for a given minimum success probability when making different tradeoffs.

When solving using lattice-based post-processing without enumerating L^j, the number of runs n required for a fixed tradeoff factor s tends to s + 1 asymptotically as m tends to infinity. By enumerating, n may be further reduced. Notably, when not making tradeoffs, a single run suffices to solve with at least 99% success probability, provided a small number of lattice vectors are enumerated.

Throughout this work, we have assumed the bit length of r to be known. However, it should be straight-forward to generalize the analysis to handle situations where only an upper bound on the bit length of r is known.

Article note: Funding and support was provided by the Swedish NCSA that is a part of the Swedish Armed Forces.

Acknowledgement

I am grateful to Johan Håstad for valuable comments and advice, and to Lennart Brynielsson and other colleagues for their comments on early versions of this manuscript. Funding and support for this work was provided by the Swedish NCSA that is a part of the Swedish Armed Forces. Computations were performed at PDC at KTH. Access was provided by SNIC.

References

[1] L. Babai, On Lovász’ lattice reduction and the nearest lattice point problem, Combinatorica 6 (1986), no. 1, 1–13.10.1007/BFb0023990Suche in Google Scholar

[2] E. Barker et al., Recommendation for Pair-Wise Key-Establishment Schemes Using Discrete Logarithm Cryptography, NIST SP 800–56A (2018), rev. 3.10.6028/NIST.SP.800-56Ar3Suche in Google Scholar

[3] W. Diffie and M.E. Hellman, New Directions in Cryptography, IEEE Transactions on Information Theory 22 (1976), no. 6, 644–654.10.1109/TIT.1976.1055638Suche in Google Scholar

[4] G. Einarsson, Probability Analysis of a Quantum Computer, ArXiv quant-ph/0303074 (2003).Suche in Google Scholar

[5] M. Ekerå, Modifying Shor's algorithm to compute short discrete logarithms, IACR ePrint Archive Report 2016/1128 (2016).Suche in Google Scholar

[6] M. Ekerå, On completely factoring any integer efficiently in a single run of an order finding algorithm, ArXiv 2007.10044 (2020).10.1007/s11128-021-03069-1Suche in Google Scholar

[7] M. Ekerå, On post-processing in the quantum algorithm for computing short discrete logarithms, Des. Codes, Cryptogr. 88 (2020), no. 11, 2313–2335.10.1007/s10623-020-00783-2Suche in Google Scholar

[8] M. Ekerå and J. Håstad, Quantum algorithms for computing short discrete logarithms and factoring RSA integers. In: Lange T., Takagi T. (Eds) Post-Quantum Cryptography, PQCrypto 2017, LNCS (2017), no. 10346, 347–363.10.1007/978-3-319-59879-6_20Suche in Google Scholar

[9] D. Gillmor, Negotiated Finite Field Diffie-Hellman Ephemeral Parameters for Transport Layer Security (TLS), RFC 7919 (2016).10.17487/RFC7919Suche in Google Scholar

[10] R.B. Griffiths and C.-S. Niu, Semiclassical Fourier Transform for Quantum Computation, Phys. Rev. Lett. 76 (1996), no. 17, 3228–3231.10.1103/PhysRevLett.76.3228Suche in Google Scholar

[11] J. Håstad, A. Schrift and A. Shamir, The Discrete Logarithm Modulo a Composite Hides O(n) bits, J. Comput. Syst. Sci. 47 (1993), no. 3, 376–404.10.1016/0022-0000(93)90038-XSuche in Google Scholar

[12] R. Kannan, Improved algorithms for integer programming and related lattice problems. In: Proceedings of the 15th Symposium on the Theory of Computing, STOC ’83 (1983), 99—108.10.1145/800061.808749Suche in Google Scholar

[13] T. Kivinen and M. Kojo, More Modular Exponentiation (MODP) Diffie-Hellman groups for Internet Key Exchange, RFC 3526 (2003).10.17487/rfc3526Suche in Google Scholar

[14] A. Koenecke and P. Wocjan, Recovering the period in Shor's algorithm with Gauss’ algorithm for lattice basis reduction, ArXiv 1210.3003 (2013).Suche in Google Scholar

[15] A. Korkine and G. Zolotareff, Sur les formes quadratiques, Math. Ann. 6 (1873), no. 3, 366–389.10.1007/BF01442795Suche in Google Scholar

[16] H.W. Lenstra, A.K. Lenstra and L. Lovász, Factoring Polynomials with Rational Coefficients, Math. Ann. 261 (1982), no. 4, 515–534.10.1007/BF01457454Suche in Google Scholar

[17] G.L. Miller, Riemann's hypothesis and tests for primality, J. Comput. Syst. Sci. 13 (1976), no. 3, 300–317.10.1145/800116.803773Suche in Google Scholar

[18] M. Mosca and A. Ekert: The Hidden Subgroup Problem and Eigenvalue Estimation on a Quantum Computer. In: Proceeding from the First NASA International Conference, QCQC ’98, LNCS (1999), no. 1509, 174–188.10.1007/3-540-49208-9_15Suche in Google Scholar

[19] S. Parker and M.B. Plenio, Efficient Factorization with a Single Pure Qubit and log N Mixed Qubits, Phys. Rev. Lett. 85 (2000), no. 14, 3049–3052.10.1103/PhysRevLett.85.3049Suche in Google Scholar

[20] S.C. Pohlig and M.E. Hellman, An Improved Algorithm for Computing Logarithms over GF(p) and Its Cryptographic Significance, IEEE Trans. Inf. Theory 24 (1978), no. 1, 106–110.10.1109/TIT.1978.1055817Suche in Google Scholar

[21] R.L. Rivest, A. Shamir and L. Adleman, A Method for Obtaining Digital Signatures and Public-Key Cryptosystems, Commun. ACM 21 (1978), no. 2, 120–126.10.21236/ADA606588Suche in Google Scholar

[22] C.-P. Schnorr, A hierarchy of polynomial time lattice basis reduction algorithms, Theor. Comput. Sci. 53 (1987), no. 2–3, 201–224.10.1016/0304-3975(87)90064-8Suche in Google Scholar

[23] C.-P. Schnorr and M. Euchner: Lattice basis reduction: Improved practical algorithms and solving subset sum problems, Math. Program. 66 (1994), no. 1–3, 181–199.10.1007/3-540-54458-5_51Suche in Google Scholar

[24] J.-P. Seifert, Using fewer qubits in Shor's factorization algorithm via simultaneous Diophantine approximation. In: Naccache, D. (ed.) CT-RSA 2001, LNCS (2001), no. 2020, 319–327.10.1007/3-540-45353-9_24Suche in Google Scholar

[25] P.W. Shor, Algorithms for quantum computation: discrete logarithms and factoring. In: Proceedings of the 35th Annual Symposium on Foundations of Computer Science, SFCS ’94 (1994), 124–13410.1109/SFCS.1994.365700Suche in Google Scholar

[26] P.W. Shor: Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM J. Comput. 26 (1997), no. 5, 1484–1509.10.1137/S0097539795293172Suche in Google Scholar

A Order finding with tradeoffs

In this appendix, we recall Shor's [25] and Seifert's [24] order-finding algorithms, analyze the probability distributions they induce, show how they may be simulated, and estimate the number of runs n required to solve for r when using lattice-based post-processing.

A.1 The quantum algorithm

Given a generator g of a finite cyclic group of order r of length ~ m bits, Shor's order-finding algorithm [25] outputs an integer j that yields ~ m bits on r. Seifert [24] enabled tradeoffs in Shor's algorithm by modifying it to yield ~ m/s bits on r in each run, for s the tradeoff factor. For s = 1 Seifert's algorithm reverts to Shor's algorithm. This allows us to conveniently describe both algorithms below:

Let m be the integer such that 2^m−1 < r < 2^m, let ℓ = ⌈m/s⌉, and let
Ψ=12m+ℓ ∑a=02m+ℓ−1| a,0 〉.
Compute [a] g and store the result in the second register to obtain
Ψ=12m+ℓ ∑a=02m+ℓ−1| a,[a] g 〉.
Compute a QFT of size 2^m+ℓ of the first register to obtain
Ψ=12m+ℓ∑a=02m+ℓ−1 ∑j=02m+ℓ−1e2πi aj/2m+ℓ| j,[a] g 〉.
Observe the system to obtain j and y = [e] g where e = a mod r.

Note that Seifert's interpretation of the advantage of his algorithms is that he saves control qubits. This is not the case when recycling control qubits; see the discussion in Section 2 for a more modern interpretation of the advantage.

A.2 The probability of observing j and y

Above, in step 4, the integer j and element y = [e] g are obtained with probability

(A1) 122(m+ℓ)|∑aexp[2πi2m+ℓ aj]|2

where the sum is over all a on 0 ≤ a < 2^m+ℓ such that a ≡ e (mod r).

In this section, we seek to place (A1) on closed form. To this end, we first perform a variable substitution to obtain a contiguous summation interval. As all a that fulfill the condition that a ≡ e (mod r) are on the form a = e + n_rr where 0 ≤ n_r ≤ (2^m+ℓ − 1 − e)/r, substituting a for e + n_rr in (A1) and adjusting the phase yields

122(m+ℓ) |∑nr=0⌊(2m+ℓ−1−e)/r⌋exp[2πi2m+ℓ αrnr]|2=122(m+ℓ) |∑nr=0⌊(2m+ℓ−1−e)/r⌋eiθrnr|2

where α_r = {rj}_2^m+ℓ and θ_r = θ(α_r) = 2πα_r/2^m+ℓ. Summing over all e yields

for β such that β ≡ 2^m+ℓ (mod r), as for all e on 0 ≤ e < β it then holds that

⌊(2m+ℓ−1−e)/r⌋=⌊(2m+ℓ−1)/r⌋

whereas for all e on β ≤ e < r, it holds that

⌊(2m+ℓ−1−e)/r⌋=⌊(2m+ℓ−1)/r⌋−1.

Note that β ≢ 0 (mod r) since we require r to be on 2^m−1 < r < 2^m.

A.2.1 Closed-form expressions

Assuming θ_r ≠ 0, we may write (A2) on closed form as

β22(m+ℓ)|eiθr (⌊(2m+ℓ−1)/r⌋+1)−1eiθr−1|2+r−β22(m+ℓ) |eiθr ⌊(2m+ℓ−1)/r⌋−1eiθr−1|2.

Otherwise, if θ_r = 0, we may write (A2) on closed form as

β22(m+ℓ)(⌊(2m+ℓ−1)/r⌋+1)2+r−β22(m+ℓ) (⌊(2m+ℓ−1)/r⌋)2.

This step of the analysis is similar to a step in the analysis of Einarsson [4].

A.3 Distribution of integers j

In this section we analyze the distribution of integers j that yield α_r.

Definition A.1

Let κ_r denote the greatest integer such that 2^κ_r divides r.

Definition A.2

An argument α_r is admissible if there exists an integer j on 0 ≤ j < 2^m+ℓ such that α_r = {rj}_2^m+ℓ.

Claim A.3

All admissible arguments α_r = {rj}_2^m+ℓ are multiples of 2^κ_r.

Proof

As 2^κ_r | r and the modulus is a power of two the claim follows.

Lemma A.4

The set of integers j on 0 ≤ j < 2^m+ℓ that yield the admissible argument α_r is given by

j=(αr2κr(r2κr)−1+2m+ℓ−κrtr) mod 2m+ℓ

as t_r runs trough all integers on 0 ≤ t_r < 2^κ_r. Each admissible argument α_r hence occurs with multiplicity 2^κ_r.

Proof

As α_r ≡ rj (mod 2^m+ℓ), the lemma follows by solving for j.

A.4 Simulating the quantum algorithm

In this section, we first construct a high-resolution histogram for the probability distribution induced by the quantum algorithm for known r. We then proceed to sample the histogram to simulate the quantum algorithm.

A.4.1 Constructing the histogram

To construct the histogram, we exactly follow [7]: We divide the argument axis into regions and subregions and integrate the closed-form probability expression numerically in each subregion.

First, we subdivide the negative and positive sides of the argument axis into 30 + μ regions where μ = min(ℓ − 2, 11). Each region thus formed may be uniquely identified by an integer η_r by requiring that for all α_r in the region

2|ηr|≤|αr|≤2|ηr|+1 and sgn(αr)=sgn(ηr)

where m − 30 ≤ |η_r| < m + μ − 1. Then, we subdivide each region into subregions identified by an integer ξ_r by requiring that for all α_r in the subregion

2|ηr|+ξr/2ν≤|αr|≤2|ηr|+(ξr+1)/2ν

for ξ_r an integer on 0 ≤ ξ_r < 2^ν and ν a resolution parameter.

For each subregion, we compute the approximate probability mass contained within the subregion by applying Simpson's rule, followed by Richardson extrapolation to cancel the linear error term. Simpson's rule is hence applied 2^ν (1 + 2) times in each region. Each application requires the probability to be computed in up to three points (the two endpoints and the midpoint), for which purpose we use the closed-form expression developed in Section A.2.1.

Note that we should furthermore multiply by the multiplicity of arguments 2^κ_r, see Lemma A.4 in Section A.3, and divide by 2^κ_r to account for the density of distinct pairs in the region. However, these operations cancel. Note also that this method of constructing the histogram assumes κ_r to be small in relation to m.

To obtain a high degree of accuracy in the tail, we fix to ν = 11 for all regions. This enables us use this histogram as a reference when adaptively selecting the resolution for the two-dimensional histogram in Section 5.1, see Lemma D.1.

A.4.2 Understanding the probability distribution

The probability distribution is plotted on the signed logarithmic argument axis in Figure 4 for m = 2048 and s = 30, and for r selected as explained in Section 7.3. The regions form two contiguous symmetric areas on the argument axis, as is illustrated in Figure A1. As expected, the distribution plotted is virtually identical to the marginal distribution along the vertical α_r axis in Figure 4.

Figure A1

The probability distribution induced by the order-finding algorithm, computed as in Section A.4.1, for m = 2048 and s = 30, and for r selected as in Section 7.3. To facilitate printing, the resolution has been reduced in this figure.

The probability mass is located in the regions where |α_r| ~ 2^m, whereas for random outputs the argument would be of size ~ 2^m+ℓ. Hence, a single run of the quantum algorithm yields ~ ℓ ~ m/s bits of information on r.

A.4.3 Sampling the probability distribution

To sample an argument α_r from the distribution, we exactly follow [7]: We first sample a subregion from the histogram and then sample α_r uniformly at random this subregion, with the restriction that 2^κ_r must divide α_r so that α_r is admissible. To sample a subregion from the histogram, we order all subregions in the histogram by probability, and compute the cumulative probability up to and including each subregion in the resulting ordered sequence, in analogy with Section 5.3.

Then, we sample a pivot uniformly at random from [0, 1), and return the first subregion in the ordered sequence for which the cumulative probability is greater than or equal to the pivot. The sampling operation fails if the pivot is greater than the cumulative probability of the last subregion in the sequence.

To sample an integer j from the distribution, we first sample an argument α_r and then select an integer j yielding α_r uniformly at random from the set of all such integers using Lemma A.4. More specifically, we first sample an integer t_r uniformly at random on the admissible interval for t_r and then compute j from α_r and t_r as described in Lemma A.4.

A.5 Classical post-processing

The probability distribution induced by the quantum algorithm in Section A.1 is virtually identical to the marginal distribution along the α_r axis in Section 5.1. Hence, the classical post-processing algorithm in Section 6.2 may be used to solve sets of n integers j output by the quantum algorithm in Section A.1 for r.

A.6 Estimating the number of runs required

To estimate n for problem instance given by r, we exactly follow [7]:

For n = s + 1, s + 2, . . . we first estimate R˜r by sampling N = 10⁶ sets of n arguments α_r, as explained in Sections A.4.3 and 6.3, and record the smallest n for which the volume quotient v_r < 2 with probability q_r ≥ q ≥ 99%. With this estimate of n as our starting point, we then sample M = 10³ sets of n integers j, as explained in Section A.4.3, and test whether recovery of r is successful for at least ⌈Mq⌉ sets when executing the post-processing algorithm in Section 6.2 without enumerating L^j. Depending on the outcome of the test, we either increment or decrement n, and repeat the process, recursively, until the smallest n such that the test passes has been identified.

Executing this procedure, for m and s selected as described in Section 7.2, both for r selected as explained in Section 7.3, and for maximal r = 2^m − 1, produced the estimates in Table A1 and Table A2, respectively. Note that for A the initial and B the simulated estimate, we print B / A, unless B = A, we then only print A. Note furthermore that we have excluded m = 384 to reduce the table width.

Table A1

The estimated number of runs n required to solve for an order r, selected as in Section 7.3, with ≥ 99% success probability, without enumerating the lattice.

		group size m
		128	256	512	1024	2048	4096	8192

tradeoff factor s	1	2	2	2	2	2	2	2
	2	3	3	3	3	3	3	3
	3	4	4	4	4	4	4	4
	4	6	5	5	5	5	5	5
	5	7	6	6	6	6	6	6
	6	9	8	7	7	7	7	7
	7	12 / 11	9	8	8	8	8	8
	8	16 / 15	11	10	9	9	9	9
	10	– / 25	14	12	11	11	11	11
	20	–	– / 54	28 / 29	24	22	21	21
	30	–	–	– / 53	39 / 38	34	32	31
	40	–	–	–	– / 58	48 / 47	44	42
	50	–	–	–	–	– / 63	56	53
	80	–	–	–	–	–	– / 95	– / 87

Table A2

The estimated number of runs n required to solve for a maximal order r = 2^m − 1 with ≥ 99% success probability, without enumerating the lattice.

		group size m
		128	256	512	1024	2048	4096	8192

tradeoff factor s	1	2	2	2	2	2	2	2
	2	3	3	3	3	3	3	3
	3	4	4	4	4	4	4	4
	4	6	5	5	5	5	5	5
	5	8 / 7	6	6	6	6	6	6
	6	9	8	7	7	7	7	7
	7	11 / 12	9	8	8	8	8	8
	8	17 / 16	11	10	9	9	9	9
	10	– / 25	14	12	11	11	11	11
	20	–	– / 55	30 / 29	23 / 24	22	21	21
	30	–	–	– / 53	37 / 39	34	32	31
	40	–	–	–	– / 59	48 / 47	44	42
	50	–	–	–	–	– / 63	57 / 56	53
	80	–	–	–	–	–	– / 95	– / 87

The tabulated estimates are for not enumerating the lattice L^j. By enumerating a bounded number of vectors in the lattice, n may potentially be further reduced. In particular, our experiments show that a single run suffices to solve with probability q ≥ 99% for s = 1, provided we accept to enumerate up to ~ 3.5 · 10² vectors.

A.7 Applications of order finding to integer factoring

Quantum algorithms for order finding may be used to factor integers, as was first proposed by Shor [25] using a reduction due to Miller [17]. To factor a composite integer N, that is odd and not a perfect prime power, Shor originally proposed to proceed as follows: Pick an integer g ∈ (1, N) and compute D = gcd(g, N). If D ≠ 1, then D is a non-trivial factor of N. In practice, small and moderate size factors of N would typically be removed before attempting to factor N via order finding. Hence, it is unlikely that factors would be found in this manner.

If D = 1, then g may be perceived as a generator of a cyclic subgroup 〈g〉⊂ℤN* , and its order r computed using a quantum algorithm for order finding.

As g^r ≡ 1 (mod N), it must be that g^r − 1 ≡ 0 (mod N). If r is even and g^r^/2 ≢ − 1 (mod N), we have that g^r^/2 ± 1 ≢ 0 (mod N), whilst

gr−1≡(gr/2−1)(gr/2+1)≡0 (mod N),

so non-trivial factors of N may be found by computing gcd((g^r^/2 mod N) ± 1, N). This reduces the integer factoring problem to an order finding problem.

Shor originally proposed to use this reduction, and to simply re-run the whole algorithm if any of the above requirements are not fulfilled, or if the order-finding algorithm fails to yield r. In [25], Shor lower-bounds the probability of his order-finding algorithm yielding r in a single run, and of non-trivial factors of N being found given r, so as to obtain a lower bound on the overall success probability. A number of improvements have since been proposed, see the introduction to [6] for an overview.

In this appendix, we have shown that the probability of Shor's original order-finding algorithm yielding r in a single run is very close to one. Furthermore, we have estimated the number of runs required to obtain a similarly high success probability when making tradeoffs in Seifert's order-finding algorithm. In [6], it is shown that any integer N may be completely factored classically into all of its constituent prime factors with very high probability after a single call to an order-finding algorithm. Hence, the estimates we provide of the number of runs required for Shor's and Seifert's order-finding algorithms to yield r are also estimates of the number of runs required to completely factor N via these order-finding algorithms.

A.7.1 Factoring RSA integers

Note that if N is an RSA [21] integer, as is typically the case in cryptographic applications, a more efficient approach to factoring N is to use the algorithm of Ekerå and Håstad [8]. This algorithm reduces the RSA integer factoring problem to a short discrete logarithm problem via [11] and solves this problem quantumly.

As is shown in [7], the quantum part of Ekerå-Håstad's algorithm imposes less requirements on the quantum computer than Shor's or Seifert's order-finding algorithms, in each run and overall, both when making and not making tradeoffs. The probability of recovering the logarithm d is very close to one. The two prime factors of N may then be recovered deterministically from d.

B Short discrete logarithms with tradeoffs

The experiments in Appendix A for order finding are analogous with those for short discrete logarithms in [7]. For completeness, and so as to enable comparisons, we have run additional experiments for short discrete logarithms following [7], both for maximal d = 2^m − 1, and for d selected as described in Section 7.2. These experiments produced the estimates in Table B1 and Table B2, respectively. Note that for A the initial and B the simulated estimate, we print B / A, unless B = A, we then only print A.

Table B1

The estimated number of runs n required to solve for a short logarithm d, selected as in Section 7.3, with ≥ 99% success probability, without enumerating.

		logarithm size m
		128	256	512	1024	2048	4096	8192

tradeoff factor s	1	2	2	2	2	2	2	2
	2	3	3	3	3	3	3	3
	3	4	4	4	4	4	4	4
	4	6	5	5	5	5	5	5
	5	8	6	6	6	6	6	6
	6	10	8	7	7	7	7	7
	7	13	10 / 9	8	8	8	8	8
	8	18	11	10	9	9	9	9
	10	– / 32	15	12	11	11	11	11
	20	–	– / 71	32 / 30	24	22	21	21
	30	–	–	– / 60	40	35	33 / 32	31
	40	–	–	–	– / 62	50 / 48	44	42
	50	–	–	–	–	– / 65	57	54 / 53
	80	–	–	–	–	–	– / 97	– / 88

Table B2

The estimated number of runs n required to solve for a maximal short logarithm d = 2^m − 1 with ≥ 99% success probability, without enumerating.

		logarithm size m
		128	256	512	1024	2048	4096	8192

tradeoff factor s	1	2	2	2	2	2	2	2
	2	3	3	3	3	3	3	3
	3	4	4	4	4	4	4	4
	4	6	5	5	5	5	5	5
	5	8	6	6	6	6	6	6
	6	10	8	7	7	7	7	7
	7	13	9	8	8	8	8	8
	8	18	11	10	9	9	9	9
	10	– / 32	16 / 15	12	11	11	11	11
	20	–	– / 71	31	25 / 24	22	21	21
	30	–	–	– / 60	40	35	32	32 / 31
	40	–	–	–	– / 62	49 / 48	45 / 44	42
	50	–	–	–	–	– / 65	57	54 / 53
	80	–	–	–	–	–	– / 97	– / 88

C Soundness of the closed-form approximation

In this appendix, we demonstrate the fundamental soundness of the closed-form approximation to P(θ_d, θ_r) that we derived in Section 3. This appendix is rather technical and may be considered to constitute supplementary material.

C.1 Introduction and recapitulation

Recall that by Theorem 3.22, the probability P(θ_d, θ_r) of the quantum algorithm yielding (j, k), with associated angle pair (θ_d, θ_r), summed over all r group elements y = [e] g ∈ 𝔾, may be approximated by

P˜(θd,θr)=22σr22(m+2ℓ)f(θr) g(θd,θr)

where we have introduced some new notation in the form of the two functions

f(θr)=|∑nr=0⌈2m+ℓ/r⌉−1eiθrnr|2 g(θd,θr)=|∑t=02ℓ−σ−1ei(2σθd+⌈−2σd/r⌉θr)t|2

that we shall use throughout this section, and that may both be placed on closed form. The error when approximating P(θ_d, θ_r) by P˜(θd,θr) is bounded by

|P˜(θd,θr)−P(θd,θr)|≤e˜(θd,θr),

again by Theorem 3.22, where the function e˜(θd,θr) is defined.

C.1.1 Overview of the soundness argument

In what follows, we demonstrate the fundamental soundness of the above closed-form approximation, by summing P˜(θd,θr) analytically to show that a large fraction of the probability mass is within a specific region of the plane, and by summing e˜(θd,θr) analytically to show that the total approximation error in this region is negligible. Asymptotically, in the limit as m tends to infinity for fixed s, the fraction of the probability mass captured tends to one whilst the error tends to zero. This implies that P˜(θd,θr) asymptotically captures the probability distribution completely and exactly.

C.2 Preliminaries

Before we proceed as outlined above, we first introduce some preliminaries.

Lemma C.1

Let φ ∈ ℝ and θ(u) = 2πu/2^ω for ω > 0 an integer. Then

∑u=02ω+c−ς−1 |∑t=0N−1ei(2ςθ(u)+φ)t|2=2ω+c−ςN

for integers c, ς and N such that c ≥ 0, 0 ≤ ς < ω and 0 < N ≤ 2^ω−ς.

Proof

For any ϕ ∈ ℝ it holds that

|∑t=0N−1ei(2ςϕ+φ)t|2=(∑t=0N−1ei(2ςϕ+φ)t)(∑t=0N−1e−i(2ςϕ+φ)t)=∑t=−N+1N−1(N−|t|) ei(2ςϕ+φ)t=N+∑t=1N−1(N−t) (ei(2ςϕ+φ)t+e−i(2ςϕ+φ)t).

Hence

∑u=02ω+c−ς−1|∑t=0N−1ei(2ςθ(u)+φ)t|2=∑u=02ω+c−ς−1(N+∑t=1N−1(N−t) (ei(2ςθ(u)+φ)t+e−i(2ςθ(u)+φ)t))=2ω+c−ςN+ ∑t=1N−1(N−t)∑u=02ω+c−ς−1(ei(2ςθ(u)+φ)t+e−i(2ςθ(u)+φ)t)︸=0

as for any integer t on 0 < |t| < N ≤ 2^ω−ς and ς < ω, the sum

∑u=02ω+c−ς−1 ei(2ςθ(u)+φ)t=eiφtei2ς(2π/2ω) 2ω+c−ςt−1ei2ς(2π/2ω)t−1=eiφte2πi 2ct−1e2πi 2ς−ωt−1=0

as the denominator is non-zero, and so the lemma follows.

C.2.1 Bounding tail regions

Claim C.2

For Δ and N integers such that 1 < Δ < N it holds that

∫ΔNduu2<∑z=ΔN−11z2<∫Δ−1N−1duu2<1Δ−1≤2Δ.

Proof

∫zz+1duu2=1z+z2<1z2<1z2−z=∫z−1zduu2

for z any integer such that z > 1, it follows that

∫ΔNduu2=∑z=ΔN−1(∫zz+1duu2)<∑z=ΔN−11z2<∑z=ΔN−1( ∫z−1zduu2)=∫Δ−1N−1duu2

where, for Δ and N integers on 1 < Δ < N, it holds that

∫Δ−1N−1duu2=1Δ−1−1N−1<1Δ−1≤2Δ

and so the claim follows.

Claim C.3

For any ϕ ∈ ℝ such that 0 < |ϕ| ≤ π it holds that

|∑t=0N−1eiϕt|2≤24ϕ2.

Proof

As ϕ ≠ 0, we have by Claim C.4 below that

|∑t=0N−1eiϕ|2=|eiNϕ−1eiϕ−1|2≤22|eiϕ−1|2≤24ϕ2

and so the claim follows.

Claim C.4

|e^iϕ − 1| ≥ |ϕ|/2 for any ϕ ∈ ℝ such that |ϕ| ≤ π.

Proof

It suffices to show that |e^iϕ − 1|² = 2(1 − cos ϕ) ≥ ϕ²/4 from which the claim follows as cos ϕ ≤ 1 − ϕ²/8 for any ϕ ∈ ℝ such that |ϕ| ≤ π.

C.2.2 Intervals of admissible arguments and angles

To facilitate the analysis, we need notation to handle intervals of admissible angles:

Definition C.5

Let Θ_r(I) be the set of distinct admissible θ_r on the interval I.

Definition C.6

For a fixed admissible θ_r, let Θ_d(I, θ_r) be the set of distinct admissible θ_d on the interval I.

C.2.3 Parameterizing the admissible arguments and angles

Furthermore, we need a convenient method for parameterizing the distinct admissible argument pairs (α_d, α_r), or angle pairs (θ_d, θ_r).

Claim C.7

The admissible arguments α_d and α_r may be parameterized by

αd(ud,ur)=(δrur mod 2m−γ)+2m−γud αr(ur)=2κrur

and the corresponding admissible angles θ_d and θ_r may be parameterized by

θd(ud,ur)=2π2m+ℓ αd(ud,ur) θr(ur)=2π2m+ℓ αr(ur)

for integers u_d ∈ [−2^ℓ+γ−1, 2^ℓ+γ−1) and u_r ∈ [−2^{m+ℓ−κ_r−1}, 2^{m+ℓ−κ_r−1}) when not accounting for multiplicity.

Proof

By Lemma 4.4, the admissible argument pairs (α_d, α_r) are vectors in L^α in the region of the plane where α_d, α_r ∈ [−2^m+ℓ−1, 2^m+ℓ−1). The parameterization takes u_r times the first row and u_d times second row of the basis matrix for L^α. It furthermore uses the second row to reduce the starting point δ_r u_r modulo 2^m−γ. The claim follows from this analysis.

C.3 Establishing a baseline

We begin by proving that the sum of P˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the region where θ_r ∈ [−π, π) and θ_d ∈ [−π/2^σ, π/2^σ), tends to one asymptotically in the limit as m tends to infinity for fixed s.

C.3.1 The inner sum over g(θ_d, θ_r)

Lemma C.8

For θ_d ∈ Θ_d([−π/2^σ, π/2^σ), θ_r), the inner sum

∑θd∈Θd([−π/2σ, π/2σ), θr)g(θd,θr)=22(ℓ−σ)+γ.

Proof

The function g(θ_d, θ_r) is non-negative and periodic in θ_d for fixed θ_r. It cycles exactly 2^σ times on the interval θ_d ∈ [−π, π), as may be seen in Figure C1 where g(θ_d, θ_r) is plotted continuously in θ_d for θ_r fixed to zero. Fixing a different value of θ_r shifts the graph cyclically along the θ_d axis.

Figure C1

The function g(θ_d, 0) plotted continuously in θ_d on the interval |θ_d| ≤ π for σ = 3 and sample parameters selected to make the figure readable.

This implies that we may parameterize θ_d in u_d and u_r using Claim C.7 and sum θ_d(u_d, u_r) over any consecutive sequence of 2^ℓ+γ−σ values of u_d for the fixed u_r given by θ_r to sum over all θ_d ∈ Θ_d([−π/2^σ, π/2^σ), θ_r).

By using this approach and Lemma C.1 we obtain

where we have used that we may shift the interval in ud, and introduced the constant phase

φ=2σ(2π(δrur mod 2m−γ)/2m+ℓ)+⌈−2σd/r⌉θr(ur),

and so the lemma follows.

C.3.2 The outer sum over f (θ_r)

Lemma C.9

For θ_r ∈ Θ_r([−π, π)), the outer sum

∑θr∈Θr([−π, π))f(θr)=2m+ℓ−κr⌈2m+ℓr⌉.

Proof

The function f (θ_r) is non-negative and periodic in θ_r. It cycles exactly once on the interval θ_r ∈ [−π, π). This implies that we may parameterize θ_r in u_r using Claim C.7, and sum over all 2^m+ℓ−κ_r values of u_r to sum over all θ_r ∈ Θ_r([−π, π)). By using this approach and Lemma C.1, we thus obtain

∑θr∈Θr([−π, π))f(θr)=∑ur=−2m+ℓ−κr−12m+ℓ−κr−1−1 |∑nr=0⌈2m+ℓ/r⌉−1eiθr(ur)nr|2=∑ur=02m+ℓ−κr−1 |∑nr=0⌈2m+ℓ/r⌉−1ei(2πur/2m+ℓ−κr)nr|2=2m+ℓ−κr⌈2m+ℓr⌉

by using that we may shift the interval in ur, and so the lemma follows.

C.3.3 Combined result

Lemma C.10

The combined sum over all distinct admissible angle pairs (θ_d, θ_r), in the region where θ_d ∈ [−π/2^σ, π/2^σ) and θ_r ∈ [−π, π), is

∑θr∈Θr([−π, π))θd∈Θd([−π/2σ, π/2σ), θr)P˜(θd,θr)=2γ−κr r2m+ℓ⌈2m+ℓr⌉.

Proof

By combining Lemmas C.8 and C.9, we obtain

∑θr∈Θr([−π, π))θd∈Θd([−π/2σ, π/2σ), θr)P˜(θd,θr)=22σr22(m+2ℓ)∑θr∈Θr([−π, π))f(θr)∑θd∈Θd([−π/2σ, π/2σ), θr)g(θd,θr)=22σr22(m+2ℓ)⋅22(ℓ−σ)+γ⋅2m+ℓ−κr⌈2m+ℓr⌉=2γ−κr⋅r2m+ℓ⌈2m+ℓr⌉

as the inner sum reduces to a constant, and so the lemma follows.

It follows from Lemma C.10 above that the sum of P˜(θd,θr) over all admissible angle pairs (θ_d, θ_r) in the region where θ_d ∈ [−π/2^σ, π/2^σ) and θ_r ∈ [−π, π) tends to one as m tends to infinity for fixed s, when accounting for the fact that each distinct admissible angle pair (θ_d, θ_r) occurs with multiplicity 2^κ_r−γ by Lemma 4.4.

The total approximation error, as upper-bounded by summing e˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the region, is non-negligible however. In the next section we address this problem by reducing the size of the region.

C.4 Adapting the region to reduce the error

In this section, we show that the sum of P˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the central region as defined below, captures a fraction of the probability mass in τ.

Definition C.11

The central region is the part of the plane where |θ_d| ≤ B_d and |θ_r| ≤ B_r, for B_d = 2^τ−ℓ+1π and B_r = B_d/2, and for τ an integer constant such that 1 < τ < ℓ − σ − 1.

In the next section, we describe how the approximation error, as upper-bounded by summing e˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the central region, depends on τ. For appropriate σ and τ, a large fraction of the probability mass is in the central region, whilst the total approximation error is negligible in the region.

Note that by the above definition of B_d and B_r, all argument pairs (α_d, α_r) such that |α_d| ≤ 2^m+τ and |α_r| ≤ 2^m+τ−1 are in the central region. Note furthermore that B_r < B_d = 2^τ−ℓ+1 π ≤ 2^−σ−1 π, so the central region is a subregion of the region we considered in the previous section.

C.4.1 The inner sum over g(θ_d, θ_r)

Lemma C.12

For θ_r ∈ Θ_r([−B_r, B_r]), the inner sum

∑θd∈Θd([−Bd, Bd], θr)g(θd,θr)≥22(ℓ−σ)+γ(1−25π212τ).

Proof

First observe that for I_d = [−π/2^σ, −B_d] ∪ [B_d, π/2^σ] we have

∑θd∈Θd([−Bd, Bd], θr)g(θd, θr)≥22(ℓ−σ)+γ−∑θd∈Θd(Id, θr)g(θd, θr)

as g(θ_d, θ_r) is non-negative, and as by dividing the interval

∑θd∈Θd([−π/2σ, π/2σ), θr)g(θd, θr)=∑θd∈Θd([−π/2σ,−Bd), θr)g(θd, θr)+∑θd∈Θd([−Bd, Bd], θr)g(θd, θr)+∑θd∈Θd((Bd, π/2σ), θr)g(θd, θr)=22(ℓ−σ)+γ

where we also used Lemma C.8. We hence seek an upper bound to

(C1) ∑θd∈Θd(Id, θr)g(θd, θr)=∑θd∈Θd(Id, θr)g(θd+⌈−2σd/r⌉θr/2σ,0)≤∑θd∈Θd(Id, θr)h(θd+⌈−2σd/r⌉θr/2σ)

that is independent of θ_r, where we have used Claim C.3 to obtain (C1), by introducing the function h(θ_d) = 2⁴/(2^σθ_d)² that is strictly decreasing in |θ_d|.

The situation is depicted in Figure C2, where g(θ_d, θ_r) for θ_r = 0 is plotted continuously in θ_d, for |θ_d| ≤ π in the top graph, and |θ_d| ≤ π/2^σ in the middle graph.

Figure C2

The functions g(θ_d, 0) and h(θ_d) = 2⁴/(2^σθ_d)² plotted for σ = 3, ℓ = 9 and τ = 3. The maximum cyclic shift is bounded by B_r = B_d/2.

Fixing a non-zero θ_r ∈ Θ_r([−B_r, B_r)) shifts the top and middle graphs in Figure C2 cyclically by ⌈−2^σd/r⌉ θ_r/2^σ. As |⌈−2^σd/r⌉ θ_r/2^σ|≤ |θ_r| ≤ B_r, the maximum cyclic shift in θ_d is upper-bounded by B_r, see the bottom graph in Figure C2 where g(θ_d + B_r, 0) is plotted in yellow and g(θ_d − B_r, 0) in green.

To upper-bound (C1) it therefore suffices to sum over all distinct admissible θ_d on I_r = [−π/2^σ, −B_r] ∪ [B_r, π/2^σ], as this captures all distinct admissible θ_d in the left and right tail regions under any cyclic shift.

We have that

(C1)=∑θd∈Θd(Id, θr)h(θd+⌈−2σd/r⌉θr/2σ)

(C2) ≤maxθr∈Θr([−Br, Br]) ∑θd∈Θd(Ir, θr)h(θd)

(C3) =∑θd∈Θd(Ir, 0)h(θd)=∑θd∈Θd([Br, π/2σ], 0)2h(θd)

due to symmetry, where we have maximized the set of admissible θ_d over θ_r.

Recall that by Lemma 4.4 there is one distinct admissible argument α_d on the interval [0, 2^m−γ) for a given fixed α_r. Hence there is one distinct admissible θ_d on the interval [0, 2^{−ℓ−γ+1} π) for a given fixed θ_r. All other distinct admissible θ_d spread out from the starting point, equidistantly separated by a distance of 2^{−ℓ−γ+1} π. The distinct admissible θ_d may occur with multiplicity; however all distinct admissible θ_d occur with the same multiplicity, again see Lemma 4.4.

This implies that the sum in (C2) is maximized for θ_r equal to zero, as both endpoints of the interval B_r ≤ |θ_d| ≤ π/2^σ are then admissible, maximizing both the number of distinct admissible θ_d on the interval, and the contribution from each distinct admissible θ_d as h(θ_d) is strictly decreasing in |θ_d|.

By Claim C.7, the distinct admissible θ_d may be parameterized in u_d and u_r where

θd(ud,ur)=2π ((δrur mod 2m−γ)+2m−γud)/2m+ℓ.

Now θ_r = 0 implies u_r = 0, which in turn implies 2^τ−ℓπ = B_d/2 = B_r ≤ 2π u_d/2^ℓ+γ ≤ π/2^σ, or more succinctly 2^τ+γ−1 ≤ u_d ≤ 2^{ℓ+γ−σ−1}, which yields

(C3)=∑ud=2τ+γ−12ℓ+γ−σ−12h(θd(ud,ur))=∑ud=2τ+γ−12ℓ+γ−σ−125(2σ⋅2π ud/2ℓ+γ)2=22(ℓ−σ+γ)23π2∑ud=2τ+γ−12ℓ+γ−σ−11ud2≤22(ℓ−σ+γ)23π222τ+γ−1=22(ℓ−σ)+γ25π212τ

where we have used Claim C.2 and that γ ≥ 0 and τ > 1. This implies

∑θd∈Θd([−Bd,Bd], θr)g(θd,θr)≥22(ℓ−σ)+γ−22(ℓ−σ)+γ25π212τ=22(ℓ−σ)+γ(1−25π212τ)

and so the lemma follows.

C.4.2 The outer sum over f(θ_r)

Lemma C.13

For θ_r ∈ Θ_r([−B_r, B_r], the outer sum

∑θr∈Θr([−Br, Br])f(θr)≥2m+ℓ−κr⌈2m+ℓr⌉(1−25π212τ).

Proof

First observe that for I_r = [−π, −B_r] ∪ [B_r, π] it holds that

∑θr∈Θr([−Br, Br])f(θr)≥2m+ℓ−κr⌈2m+ℓr⌉−∑θr∈Θr(Ir)f(θr)

as f(θ_r) is non-negative and

∑θr∈Θr([−π, π))f(θr)=∑θr∈Θr([−π, −Br))f(θr)+∑θr∈Θr([−Br, Br])f(θr)+∑θr∈Θr((Br, π))f(θr)

where, by Lemma C.9, the left hand sum

∑θr∈Θr([−π, π))f(θr)=2m+ℓ−κr⌈2m+ℓr⌉.

To prove the lemma, we seek an upper bound to

(C4) ∑θr∈Θr(Ir)f(θr)≤∑θr∈Θr(Ir)24θr2≤∑θr∈Θr([Br, π])25θr2

where we have used Claim C.3, that f (θ_r) is symmetric around the origin, and that the distinct admissible θ_r are equidistantly separated by a distance of 2^κ_r around the origin by Lemma 4.4. The distinct admissible θ_r may occur with multiplicity; however all distinct admissible θ_r occur with the same multiplicity.

By Claim C.7, the distinct admissible θ_r may be parameterized in u_r where θ_r(u_r) = 2π (2^κ_ru_r)/2^m+ℓ, which implies 2^τ−ℓπ = B_r ≤ 2π (2^κ_r u_r)/2^m+ℓ ≤ π, or more succinctly 2^{m+τ−κ_r−1} ≤ u_r ≤ 2^{m+ℓ−κ_r−1}, which yields

(C4)=∑ur=2m+τ−κr−12m+ℓ−κr−125(2π 2κrur/2m+ℓ)2=22(m+ℓ−κr)23π2∑ur=2m+τ−κr−12m+ℓ−κr−11ur2≤22(m+ℓ−κr)23π222m+τ−κr−1=2m+2ℓ−κr25π212τ≤2m+ℓ−κr⌈2m+ℓr⌉25π212τ

where we have used Claim C.2 and that γ ≥ 0 and τ > 1. This implies

∑θr∈Θr([−Br, Br])f(θr)≥2m+ℓ−κr⌈2m+ℓr⌉−2m+ℓ−κr⌈2m+ℓr⌉25π212τ=2m+ℓ−κr⌈2m+ℓr⌉(1−25π212τ)

and so the lemma follows.

C.4.3 Combined result

Lemma C.14

The combined sum over all distinct admissible angle pairs (θ_d, θ_r), in the central region where |θ_d| ≤ B_d and |θ_r| ≤ B_r, is

∑θr∈Θr([−Br, Br])θd∈Θd([−Bd, Bd], θr)P˜(θd,θr)≥2γ−κr r2m+ℓ⌈2m+ℓr⌉(1−25π212τ)2.

Proof

From Lemmas C.12 and C.13 it follows that

∑θr∈Θr([−Br, Br])θd∈Θd([−Bd, Bd], θr)P˜(θd,θr)=22σr22(m+2ℓ)∑θr∈Θr([−Br, Br])f(θr)∑θd∈Θd([−Bd, Bd], θr)g(θd,θr)≥22σr22(m+2ℓ) 2m+ℓ−κr⋅22(ℓ−σ)+γ⌈2m+ℓr⌉(1−25π212τ)2=2γ−κr r2m+ℓ⌈2m+ℓr⌉(1−25π212τ)2

and so the lemma follows.

C.5 Main soundness result

In this section, we combine results from the previous sections into our main soundness result.

C.5.1 Bounding the probability mass in the central region

Theorem C.15

The sum of P˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the central region where |θ_d| ≤ B_d and |θ_r| ≤ B_r, is bounded by

r2m+ℓ⌈2m+ℓr⌉(1−25π212τ)2≤∑θr∈Θr([−Br, Br])θd∈Θd([−Bd, Bd], θr)2κr−γ P˜(θd,θr)≤r2m+ℓ⌈2m+ℓr⌉.

Proof

The theorem follows by combining Lemmas C.10 and C.14.

Theorem C.15 above lower-bounds the fraction of the probability mass that is located within the central region as a function of τ. The fraction of the probability mass that falls outside the central region decreases exponentially in τ.

C.5.2 Bounding the total error in the central region

Theorem C.16

The total error when approximating P(θ_d, θ_r) by P˜(θd,θr) , as upper-bounded by summing e˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the central region where |θ_d| ≤ B_d and |θ_r| ≤ B_r, is bounded by

∑θr∈Θr([−Br, Br])θd∈Θd([−Bd, Bd], θr)2κr−γ e˜(θd,θr)≤ 2m+2τD (262σ+252ℓ)+2τ+σ+22ℓπ (1+2τ+σ2ℓπ) r2m+ℓ⌈2m+ℓr⌉

where D is the density of admissible angle pairs (θ_d, θ_r) in the region.

Proof

The error when approximating P(θ_d, θ_r) by P˜(θd,θr) is bounded by

e˜(θd,θr)=242m+σ+232m+ℓ+2σ2(|θd|+|θr|)(2+2σ2(|θd|+|θr|))P˜(θd,θr)

by Theorem 3.22. We sum e˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the region where |θ_d| ≤ B_d and |θ_r| ≤ B_r, where B_d = 2^τ−ℓ+1 π and B_r = B_d/2 by Definition C.11. This is equivalent to summing over all admissible argument pairs (α_d, α_r), with multiplicity, in the region where |α_d| ≤ 2^m+τ and |α_r| ≤ 2^m+τ−1.

As m > 0 and τ > 1 by Definition C.11, the area of this region is

(2⋅2m+τ+1)(2⋅2m+τ−1+1)=22(m+τ)+1+2m+τ+1+2m+τ+1≤22(m+τ+1)

from which it follows that the region contains at most 2^2(m+τ+1) D admissible pairs (θ_d, θ_r), where D is the density of admissible pairs with multiplicity.

If we furthermore use that |θ_d| + |θ_r| ≤ 2^τ−ℓ+2 π, this implies that

∑θr∈Θr([−Br, Br))θd∈Θd([−Bd, Bd), θr)2κr−γ e˜(θd,θr)≤22(m+τ+1)D (242m+σ+232m+ℓ)+2τ−ℓ+σ+2π (1+2τ−ℓ+σπ) r2m+ℓ⌈2m+ℓr⌉≤2m+2τD (262σ+252ℓ)+2τ+σ+22ℓπ (1+2τ+σ2ℓπ) r2m+ℓ⌈2m+ℓr⌉

where we have used that

∑θr∈Θr([−Br, Br])θd∈Θd([−Bd, Bd], θr)2κr−γ P˜(θd,θr)≤r2m+ℓ⌈2m+ℓr⌉

by Theorem C.15, and so the theorem follows.

By Lemmas 4.7 and 4.8, the density D of admissible argument pairs (α_d, α_r), or equivalently angle pairs (θ_d, θ_r), when accounting for multiplicity, in the region is approximately 2^−m for random problem instances. Asymptotically, the density tends to 2^−m as m tends to infinity for fixed s.

Furthermore, the density is exactly 2^−m in rectangular regions of the plane of side length multiples of 2^m−γ and 2^m−γ+κ_r by Lemma 4.9. The region in Theorem C.16 above may be adapted to meet these requirements.

To understand the implications of the above theorem for the upper bound on the total approximation error in the central region, it remains to select σ to minimize the bound.

C.5.3 Selecting σ to minimize the bound on the total error in the central region

To select the integer parameter σ on 0 < σ < ℓ so as to minimize the bound on the total error given in Theorem C.16, we first approximate the error bound by

22τ+62σ︸ϵ1+22τ+52ℓ︸ϵ2+2τ+σ+22ℓπ︸ϵ3+(122τ+σ+22ℓπ)2︸ϵ4

where we have used that D ≈ 2^−m and (r/2^m+ℓ) ⌈2^m+ℓ/r⌉ ≈ 1, with equality in the limit as m tends to infinity for fixed s. The approximation is only good when all error terms are less than one, so the term ɛ₃ is greater than ɛ₄ = (ɛ₃/2)². As ɛ₂ does not depend on σ, we hence seek to select σ to equate ɛ₁ and ɛ₃. This yields

22τ+62σ=2τ+σ+22ℓπ ⇒ σ=⌊12(ℓ+τ+4−log2π)⌉.

If σ is fixed accordingly, the error bound obtained by summing e˜(θd,θr) analytically over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the region where |θ_d| ≤ B_d and |θ_r| ≤ B_r, is heuristically minimized. For this σ, the two main error terms

(C5) ϵ1≈ϵ3≈23τ/2+42ℓ/2π.

For as long as 23τ/2+4π is much smaller than 2^ℓ/2, we heuristically expect the upper bound on the total error given in Theorem C.16 to be negligible.

C.5.4 Asymptotic soundness results

Theorem C.17

For fixed s and τ, and σ = ⌊(ℓ + τ + 4 − log₂ π)/2⌉, the sum of P˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the central region where |θ_d| ≤ B_d and |θ_r| ≤ B_r, is bounded by

(C6) (1−25π212τ)2≤limm→∞ ∑θr∈Θr([−Br, Br])θd∈Θd([−Bd, Bd], θr)2κr−γ P˜(θd,θr)≤1

in the limit as m tends to infinity. The error |P(θd,θr)−P˜(θd,θr)|≤e˜(θd,θr) and the sum of e˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the central region tends to

(C7) limm→∞ ∑θr∈Θr([−Br, Br))θd∈Θd([−Bd, Bd), θr)2κr−γ e˜(θd,θr)=0

in the same limit.

Proof

The bound in (C6) follows immediately by taking the limit, as m tends to infinity for fixed s and τ, of the bound given in Theorem C.15. Analogously (C7) follows by taking the limit, as m tends to infinity for fixed s and τ, and for σ as in the formulation of this theorem, of Theorem C.16, where D tends to 2^−m in the limit by Lemma 4.8, and so the theorem follows.

The above theorem states that an arbitrarily large constant fraction of the probability mass may be captured asymptotically by expanding the region in τ.

As the bound on the error when approximating P(θ_d, θ_r) by P˜(θd,θr) in the region tends to zero asymptotically, P˜(θd,θr) equals P(θ_d, θ_r) asymptotically in the region. Furthermore, all probability mass is in the region asymptotically when τ tends to infinity with m at a moderated rate. This implies that P˜(θd,θr) asymptotically captures the probability distribution completely and exactly. Corollary C.18 below formalizes these observations:

Corollary C.18

For fixed s, for τ = ⌊ℓ/6⌉ and σ = (ℓ + τ + 4 − log₂ π)/2, the sum of P˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the central region where |θ_d| ≤ B_d and |θ_r| ≤ B_r, tends to

(C8) limm→∞∑θr∈Θr([−Br, Br])θd∈Θd([−Bd, Bd], θr)2κr−γ P˜(θd,θr)=1

in the limit as m tends to infinity. The error |P(θd,θr)−P˜(θd,θr)|≤e˜(θd,θr) and the sum of e˜(θd,θr) over all admissible angle pairs (θ_d, θ_r), with multiplicity, in the central region tends to

(C9) limm→∞∑θr∈Θr([−Br, Br))θd∈Θd([−Bd, Bd), θr)2κr−γ e˜(θd,θr)=0

in the same limit.

Proof

The bound in (C8) follows immediately by taking the limit as m tends to infinity for fixed s, and for τ as in the formulation of this corollary, of the bound given in Theorem C.15. Analogously (C9) follows by taking the limit, as m tends to infinity for fixed s, and for σ and τ as in the formulation of this corollary, of Theorem C.16, where D tends to 2^−m in the limit by Lemma 4.8. This is easy to see, as the two main error terms ɛ₁ and ɛ₃ in (C5) tend to

limm→∞23⌊ℓ/6⌉/2+42ℓ/2π=limm→∞2ℓ/4+42ℓ/2π=limm→∞242ℓ/4π=0,

where we may remove the rounding operation in the limit, and as the requirement that 1 < τ < ℓ − σ − 1 in Definition C.11 is respected in the limit. Furthermore ɛ₄ < ɛ₃ in the limit, and it is easy to see that ɛ₂ tends to zero in the limit. The corollary follows from this analysis.

D Marginal distributions

By using results and notation from the soundness analysis in Appendix C, we may immediately derive a closed-form expression for the marginal distribution that arises when summing P˜(θd,θr) over all admissible angles θ_d with multiplicity.

Lemma D.1

For θ_r ∈ Θ_r([−π, π]), the marginal probability distribution that arises when summing P˜(θd,θr) over all θ_d ∈ Θ_d([−π/2^σ, π/2^σ), θ_r) is

∑θd∈Θd([−π/2σ, π/2σ), θr)2κr−γ2κr P˜(θd,θr)=r22(m+ℓ)|∑nr=0⌈2m+ℓ/r⌉−1eiθrnr|2

when accounting for multiplicity.

Proof

By Lemma C.8 we have that

∑θd∈Θd([−π/2σ, π/2σ), θr)P˜(θd,θr)=22σr22(m+2ℓ) f(θr)∑θd∈Θd([−π/2σ, π/2σ), θr)g(θd,θr)=2γr22(m+ℓ) f(θr)=2γr22(m+ℓ) |∑nr=0⌈2m+ℓ/r⌉−1eiθrnr|2

from which the lemma follows, as the pairs (θ_d, θ_r) occur with multiplicity 2^κ_r−γ by Lemma 4.4, and the angles θ_r with multiplicity 2^κ_r by Lemma A.4.

The above expression for the marginal probability distribution is derived from the approximation P˜(θd,θr) . It corresponds to the exact expression derived in Appendix A for the order-finding algorithm with tradeoffs. Note that there are minor differences between the two expressions. These are explained by P˜(θd,θr) being an approximation to P(θ_d, θ_r), whilst the expression in Appendix A is exact.

Another way to understand why this correspondence arises is to observe that when using qubit recycling, j may first be computed, after which k may be computed. At the point in time when j has been computed and read out, but the computation of k has not yet begun, we will have executed Shor's or Seifert's order-finding algorithms. Hence j, α_r and θ_r are distributed as in these algorithms.

A closed-form analytical expression for the marginal distribution that arises when summing over all admissible θ_r is seemingly less straightforward to derive. Numerically, the marginal distribution may however be seen to correspond to that for short logarithms as stated in Section 5.2.

Received: 2020-02-08

Accepted: 2020-11-20

Published Online: 2021-04-22

This work is licensed under the Creative Commons Attribution 4.0 International License.

Artikel in diesem Heft

https://doi.org/10.1515/jmc-2020-0006

Schlagwörter für diesen Artikel

Discrete logarithms; Order finding; Shor's algorithms

Creative Commons

BY 4.0

Quantum algorithms for computing general discrete logarithms and orders with tradeoffs

Artikel

Abstract

1 Introduction

1.1 Earlier works

1.2 Our contributions

1.2.1 On the cryptographic significance of this work

1.2.2 Further details and overview

1.2.3 Structure of this paper

1.3 Notation

1.4 Randomization

2 The quantum algorithm

3 The probability of observing (j, k) and y

3.1 Preliminaries

3.1.1 Constructive interference

3.2 Closed-form approximation with error bounds

Definition 3.1

Definition 3.2

Definition 3.3

Claim 3.4

Proof

3.2.1 Preliminaries

Claim 3.5

Proof

Claim 3.6

3.2.2 Bounding |sℛ|

Lemma 3.7

Proof

Claim 3.8

Proof

3.2.3 Approximating Sℛ by S𝒜T𝒜

Definition 3.9

Definition 3.10

Claim 3.11

Proof

Definition 3.12

Lemma 3.13

Proof

Lemma 3.14

Proof

3.2.4 Approximating S𝒜 by SA′

Definition 3.15

Lemma 3.16

Proof

Lemma 3.17

Proof

Lemma 3.18

Proof

3.2.5 Approximating SA′ by S SA″

Definition 3.19

Lemma 3.20

Proof

Lemma 3.21

Proof

3.2.6 Main approximability result

Theorem 3.22

Proof

4 The distribution of pairs (αd, αr)

Definition 4.1

Definition 4.2

Definition 4.3

Lemma 4.4

Proof

4.1 Pairs (j, k) yielding (αd, αr)

Lemma 4.5

Proof

4.2 The density of pairs (αd, αr)

Claim 4.6

Proof

Lemma 4.7

Proof

Lemma 4.8

Proof

Lemma 4.9

Proof

5 Simulating the quantum algorithm

5.1 Constructing the histogram

5.2 Understanding the probability distribution

5.3 Sampling the probability distribution

6 The classical post-processing algorithms

3.2.2 Bounding |s_ℛ|

3.2.3 Approximating S_ℛ by S_𝒜T_𝒜

3.2.4 Approximating S_𝒜 by SA′

4 The distribution of pairs (α_d, α_r)

4.1 Pairs (j, k) yielding (α_d, α_r)

4.2 The density of pairs (α_d, α_r)

C.3.1 The inner sum over g(θ_d, θ_r)

C.3.2 The outer sum over f (θ_r)