Revisiting structure graphs: Applications to CBC-MAC and EMAC

  • Ashwin Jha and Mridul Nandi
Published/Copyright: November 8, 2016

Abstract

In [2], Bellare, Pietrzak and Rogaway proved an $O(\ell q^2/2^n)$ bound for the PRF (pseudorandom function) security of the CBC-MAC based on an $n$-bit random permutation $\Pi$, provided $\ell < 2^{n/3}$. Here an adversary can make at most $q$ prefix-free queries, each having at most $\ell$ “blocks” (elements of $\{0,1\}^n$). In the same paper an $O(\ell^{o(1)} q^2/2^n)$ bound for EMAC (or encrypted CBC-MAC) was proved, provided $\ell < 2^{n/4}$. Both proofs are based on structure graphs representing all collisions among “intermediate inputs” to $\Pi$ during the computation of CBC. The problem of bounding the PRF-advantage is reduced to bounding the number of structure graphs satisfying certain collision patterns. In the present paper, we show that [2, Lemma 10], stating an important result on structure graphs, is incorrect, because the authors overlooked certain structure graphs. This invalidates the proofs of the PRF bounds. In [31], Pietrzak improved the bound for EMAC by showing a tight bound $O(q^2/2^n)$ under the restriction that $\ell < 2^{n/8}$. As he used the same flawed lemma, this proof also becomes invalid. In this paper, we revise and sometimes simplify these proofs. We revisit structure graphs in a slightly different mathematical language and provide a complete characterization of certain types of structure graphs. Using this characterization, we show that the PRF security of CBC-MAC is about $\sigma q/2^n$ provided $\ell < 2^{n/3}$, where $\sigma$ is the total number of blocks in all queries. We also recover the tight bound for the PRF security of EMAC under a much more relaxed constraint ($\ell < 2^{n/4}$) than the original ($\ell < 2^{n/8}$).

1 Introduction

Brief history on CBC and EMAC.

The notion of authentication in cryptographic protocols was first introduced by Diffie and Hellman in their seminal paper [8] of 1976. In symmetric key settings, this need is fulfilled by message authentication codes, better known as MACs. CBC-MAC is a block-cipher-based MAC construction built on the CBC mode of operation invented by Ehrsam et al. [12]. The CBC-MAC was an international standard [16] which was proven to be secure for fixed length messages [1, 3] or prefix-free message spaces [30, 15]. The fixed length constraint is not desirable in practice. One way to circumvent this is to use the length of the message as the first block in the CBC computation. This requires prior knowledge of the message length. A more reasonable and popular approach is to encrypt the CBC output with an independent keyed permutation. This latter approach is called EMAC, which has been proved to be secure without any restrictions on the message [30]. We refer readers to Section 2 for a brief overview of the literature related to CBC-MAC.

CBC and EMAC functions.

Throughout the paper, we fix a positive integer $n$ and let $\mathcal{B} := \{0,1\}^n$. Elements of this set are called blocks. Let $\mathrm{Perm} := \mathrm{Perm}(n)$ be the set of all permutations over $\mathcal{B}$. The CBC (cipher block chaining) function with key $\pi \in \mathrm{Perm}$, denoted by $\mathrm{CBC}_\pi$, takes as input a message $M = (M[1], \ldots, M[m]) \in \mathcal{B}^m$ and outputs a block $\mathsf{out}_\pi(M)[m]$, which is inductively computed as $\mathsf{out}_\pi(M)[0] = 0^n$ and

$$\mathsf{out}_\pi(M)[i] = \pi(\mathsf{out}_\pi(M)[i-1] \oplus M[i]), \quad i = 1, \ldots, m.$$

For $0 < i < m$, $\mathsf{in}_\pi(M)[i] := \mathsf{out}_\pi(M)[i-1] \oplus M[i]$ and $\mathsf{out}_\pi(M)[i]$ are said to be the intermediate input and output, respectively. Figure 3 in Section 3.6 provides an illustration of the CBC computation and intermediate values.

Security definitions.

In this paper, we consider two types of attacks for an adversary which makes queries of at most $\ell$ blocks: $\mathsf{atk} = \mathsf{pf}$ and $\mathsf{atk} = \mathsf{any}$ mean that no query is a prefix of another and that the queries are arbitrary distinct strings, respectively. Let $\mathbf{Adv}^{\mathsf{atk}}_F(q, \ell, \sigma)$ denote the maximum advantage attainable by any adversary that makes $q$ queries with at most $\sigma$ blocks in total and mounts an $\mathsf{atk}$ attack, in distinguishing whether its oracle is $F$ or a random function that outputs $n$ bits.[1] To analyze the security of CBC and EMAC for the random permutation $\Pi$, the collision probability and full collision probability,

$$\mathbf{CP}_n(M_1, M_2) := \Pr_\Pi[\mathrm{CBC}_\Pi(M_1) = \mathrm{CBC}_\Pi(M_2)],$$
$$\mathbf{FCP}_n(M_1, M_2) := \Pr_\Pi[\mathsf{out}_\Pi(M_2)[m_2] = \mathsf{out}_\Pi(M_r)[j];\ (r, j) \neq (2, m_2)],$$

were introduced for distinct messages $M_1$ and $M_2$ with lengths $m_1$ and $m_2$, respectively. Moreover, let $\mathbf{CP}^{\mathsf{atk}}_{2,\ell}$ and $\mathbf{FCP}^{\mathsf{atk}}_{2,\ell}$ denote the maximum collision and full collision probabilities, respectively, where the maximum is taken over all distinct messages $M, M'$ having at most $\ell$ blocks and satisfying $\mathsf{atk}$. In [2], the following results were shown:

$$\mathbf{Adv}^{\mathsf{any}}_{\mathrm{EMAC}}(q, \ell) \le \binom{q}{2}\left(\mathbf{CP}^{\mathsf{any}}_{2,\ell} + 2^{-n}\right), \qquad \mathbf{Adv}^{\mathsf{pf}}_{\mathrm{CBC}}(q, \ell) \le q^2\left(\mathbf{FCP}^{\mathsf{pf}}_{2,\ell} + 4\ell/2^n\right).$$

As EMAC encrypts the output of CBC-MAC under an independent key, as long as there is no collision in the output of CBC-MAC, the final output behaves randomly. This is essentially the same as the Carter–Wegman construction [33]. The CBC-MAC function can be similarly viewed as a (dependent) nested construction in which the final encryption is computed under the same key as the internal computation. This is why we need an extended definition of collision, which is appropriately captured by the full collision event. Thus, bounding the PRF advantages is reduced to bounding (full) collision probabilities. These are in turn reduced to bounding the number of structure graphs, as described in the following paragraph.

Structure graph.

A block-vertex structure (BS) graph $G$ with vertex set $V \subseteq \{0,1\}^n$ associated to a message $M$ and a permutation $\pi$ is the directed edge-labeled graph induced by the edge set $E$ consisting of all edges

$$e_i : \mathsf{out}_\pi(M)[i-1] \to \mathsf{out}_\pi(M)[i] := (\mathsf{out}_\pi(M)[i-1], \mathsf{out}_\pi(M)[i]), \quad 1 \le i \le m.$$

The label for $e_i$ is $\mathcal{L}(e_i) = M[i]$. Note that a BS graph can be simply viewed as an $M$-walk. Informally, an $M$-walk is nothing but the walk generated by the message $M$ starting at $0^n$. In this paper we often use this equivalent representation of BS graphs. A structure graph $G^*$ over a vertex set $V^*$ (an index set) is a graph isomorphic to the BS graph $G$, with $0^n$ mapped to $0$. The labeled walk of $G$ is preserved in $G^*$ (up to isomorphism) due to the isomorphism between $G$ and $G^*$. So we can have a similar representation of a structure graph in terms of walks. We refer readers to Definition 1 for a more formal definition of a structure graph. The (block-vertex) structure graph is also similarly defined for a tuple (or sometimes pair) of messages $\mathcal{M} = (M_1, \ldots, M_q)$.

Given a structure graph $G^* = (V^*, E^*)$, suppose we reconstruct the graph by defining edges one by one along the $\mathcal{M}$-walk. Now there are three possibilities at any point of time: (1) we add a new edge heading to a new vertex (not obtained so far), (2) we get an old edge which is already defined, and (3) we add a new edge heading to an already existing vertex. True collisions correspond to the last case. The number of such true collisions can be equivalently defined as the following sum:

$$\mathbf{TC}(G) := \text{in-deg}(0^n) + \sum_{v \in V \setminus \{0\}} (\text{in-deg}(v) - 1).$$
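The counting formula can be tried out on a toy instance. The following is a minimal sketch, assuming a block size of two bits, a permutation given as a Python list, and hypothetical helper names `bs_graph_edges` and `true_collisions` that do not appear in the paper. It builds the block-vertex graph of a message tuple and evaluates the sum above.

```python
def bs_graph_edges(pi, messages):
    """Edges (u, v) of the block-vertex graph, mapped to their labels.

    pi is a list encoding a permutation of {0, ..., 2**n - 1};
    each message is a list of n-bit integers. Vertex 0 plays the role of 0^n.
    """
    edges = {}
    for msg in messages:
        u = 0                      # every walk starts at 0^n
        for block in msg:
            v = pi[u ^ block]      # next intermediate output
            edges[(u, v)] = block  # label of the edge u -> v
            u = v
    return edges


def true_collisions(edges):
    """in-deg(0) plus the sum of (in-deg(v) - 1) over all other vertices."""
    indeg = {}
    for (_, v) in edges:
        indeg[v] = indeg.get(v, 0) + 1
    return indeg.get(0, 0) + sum(d - 1 for v, d in indeg.items() if v != 0)


identity = [0, 1, 2, 3]                       # toy permutation, n = 2
loop_back = bs_graph_edges(identity, [[1, 1]])  # walk 0 -> 1 -> 0
no_clash = bs_graph_edges(identity, [[1, 2]])   # walk 0 -> 1 -> 3
```

Here the first walk returns to the start vertex and so has one true collision, while the second is a path with none. The sketch covers only true collisions; computing the accident would additionally require the rank of a linear system over GF(2).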

Let us assign a variable $Y_v$, meant for the intermediate output, to each node $v \in V^*$. Let $\delta := (u, v; z)$ be a triple such that $u \to z$, $v \to z$ and $u \neq v$. We call such a triple an input-collision (or simply a collision). Given any such input-collision, the following linear equation, denoted by $L_\delta$, must hold whenever the $Y$-variables are actually assigned as intermediate outputs:

$$Y_v \oplus Y_u = c_\delta, \quad \text{where } c_\delta = \mathcal{L}(v, z) \oplus \mathcal{L}(u, z).$$

When 0 has no in-degree, the accident of a structure graph G*, denoted by 𝐀𝐜𝐜(G), is the rank of all linear equations Lδ over all collisions of the graph. When 0 has positive in-degree, we add one to the rank to define the accident. In Section 5, we provide a more detailed study on the structure graph.

A flaw in [2, Lemma 10].

Lemma 10 of [2] states that for any structure graph $G^*$ realized by a pair of messages, $\mathbf{Acc}(G^*) = 1$ implies $\mathbf{TC}(G^*) = 1$. This result has been used to bound $\mathbf{FCP}_n$ (in [2]) as well as $\mathbf{CP}_n$ (in [2, 31]). Unfortunately, the claim is incorrect, as illustrated in Figure 1, where we have two structure graphs with true collision 2 and accident 1. Surprisingly, this flaw remained unobserved until now, although the lemma has been applied in other results.

Figure 1: Counter-example for [2, Lemma 10]. The walks corresponding to the two messages start at $v_1$ and end at $v_3$. Here $M_2[1] := M_1[1] \oplus M_1[2] \oplus M_1[3]$ and $*$ can be any number of blocks. In particular, when $*$ has no block, the figures in (a) and (b) are identical. In (a) we have two input-collisions $\delta_1 := (v_1, v_2; v_2')$ and $\delta_2 := (v_1, v_2; v_3)$. The two linear equations $L_{\delta_1}$ and $L_{\delta_2}$ corresponding to the two input-collisions are both $Y_{v_1} \oplus Y_{v_2} = M_1[1] \oplus M_1[2]$, and so the rank of all collisions (which is also the accident) is one. However, the true collision is two (at $v_2'$ and $v_3$), which contradicts [2, Lemma 10]. A similar argument can be given for (b).

1.1 Our contributions

The flaw in [2, Lemma 10] is an important observation that affects several results [2, 14, 35, 31] based on it. Naturally, the next course of action is to study the impact of this flaw on the results, in addition to [2], where the lemma has been applied. This work serves this purpose. To the best of our knowledge, it has been applied in [31] and probably in [9, Lemma 3] (no proof of this claim is publicly available though). The bound on $\mathbf{FCP}^{\mathsf{pf}}_{2,\ell}$ (see [2]) is also used in the PRF analysis of truncated CBC [14]. Any revision of the $\mathbf{FCP}^{\mathsf{pf}}_{2,\ell}$ bound of [2] will also necessitate a revision of the bound in [14].

Characterization of all accident-one structure graphs.

As [2, Lemma 10] is wrong and we have identified two graphs which violate this lemma, it is important to see whether there are any more missing cases. We first settle this issue and show that these are the only missing cases. To do so, we characterize all structure graphs (realized by a single message) having at most accident 1 (see Lemma 1). This will actually help when we study structure graphs for two messages.

Revision of the 𝐂𝐏 and 𝐅𝐂𝐏, and PRF bound of CBC.

We revise the 𝐅𝐂𝐏 bound of [2]. Fortunately, the upper bounds of 𝐅𝐂𝐏, and hence PRF advantage of CBC, are only increased by a constant factor keeping the order of the bound unchanged (see Section 7). In case of the 𝐂𝐏 bound due to Bellare, Pietrzak and Rogaway, their [2, Lemma 15] used to bound the main claim is false. Fortunately, it can be shown that the main claim remains true after revision.

Revision of the PRF bound of EMAC.

We revisit Pietrzak’s [31] proof of the PRF bound for EMAC in Section 8. Unfortunately, a straightforward revision gives a non-tight bound on EMAC. Then, we take a different approach (by considering a different bad event) to show in Theorem 2 the tight bound for EMAC. Our approach is much simpler and gives the tight bound even for a more relaxed choice of $\ell$, namely $\ell < 2^{n/4}$, whereas the original constraint was $\ell < 2^{n/8}$.

Other consequences.

The 𝐂𝐏 and 𝐅𝐂𝐏 bounds in [2] have been used in multiple subsequent works. For instance, Gaži, Pietrzak and Tessaro [14] applied this result to bound the probability of a bad event in PRF analysis of truncated CBC. Similarly, Yasuda [35] used the 𝐂𝐏 bound of [2]. Fortunately, the proofs in these works hold with small changes in constant factors. Since these changes are minor, in this paper we concentrate solely on [2, 31].

2 Related works

The security of MAC constructions has seen constant research interest. Among the block cipher based constructions CBC-MAC and its variants are the most popular. Here we try to summarize the research on PRF security of CBC-MAC and its variants. The aim is to list the state of the art results as well as emphasize the progress that has been made till date.

Analysis of CBC-MAC.

First concrete results on CBC-MAC were given by Bellare, Kilian and Rogaway [1]. They showed a bound of $2\ell^2 q^2/2^n$ for fixed length queries, which was further improved to $\ell^2 q^2/2^n$ by Maurer [22]. Later Bernstein [3] simplified the proof for fixed-length CBC-MAC. Petrank and Rackoff [30] extended the proof in [1] to prefix-free queries, and a similar extension of Bernstein’s proof was done by Rackoff and Gorbunov [15]. Both bounds are about $\ell^2 q^2/2^n$. The most recent bound on CBC-MAC is by Bellare, Pietrzak and Rogaway [2], who improved (in terms of $\ell$) the bound to $12\ell q^2/2^n + 64\ell^4 q^2/2^{2n}$. Another way of improving the bound is to show a PRF bound of the form $q\sigma/2^n$ (see [26]).

Analysis of EMAC.

In [1], Bellare, Kilian and Rogaway also suggested some variants of CBC-MAC to handle variable length messages. In particular, they mentioned a construction where the output of CBC-MAC is further encrypted under an independent key. This construction, known as EMAC, was first developed during the RACE project [4]. Petrank and Rackoff [30] proved that DMAC (same as EMAC) is secure up to $2.5\ell^2 q^2/2^n$. Bellare, Pietrzak and Rogaway [2] improved the bound to $q^2 d'(\ell)/2^n$, which was further improved by Pietrzak [31] to $q^2/2^n$ for $\ell \le 2^{n/8}$. However, the proof of the latter result is invalid due to the flaw that we discussed earlier. A result on $\mathbf{CP}^{\mathsf{eq}}_{2,\ell}$ stated in [9] also gives a tight bound of $O(q^2/2^n)$ for equal length messages.

Analysis of variants of CBC-MAC and EMAC.

Although the EMAC construction is tolerant to variable length messages, it has a domain limited to $\mathcal{B}^+$. Black and Rogaway [6] introduced three refinements to EMAC, viz., ECBC, FCBC and XCBC, to allow the use of variable block length strings. They showed that ECBC and FCBC are secure up to $2.5\sigma^2/2^n$, and that the bound on XCBC is $3.75\sigma^2/2^n$. Jaulmes, Joux and Valette [19] gave a randomized version of EMAC which they called RMAC and proved that the construction resists birthday attacks. However, the proof seems to be incorrect (as suggested in [2]). Other excellent variants of CBC-MAC are TMAC [21], OMAC [17] and GCBC [24]. A variant of OMAC, namely OMAC1, is equivalent to CMAC, which became a NIST recommendation [11] in 2005. Another design approach is the PMAC construction proposed by Black and Rogaway [5], which is inherently parallel. In [23, 18, 27, 25], the improved bounds for XCBC, TMAC, PMAC and OMAC are shown in the form of $O(\ell q^2/2^n)$, $O(\sigma^2/2^n)$ and $O(\sigma q/2^n)$. Apart from these specific constructions, Jutla [20] suggested a general class of DAG-based PRF constructions.

Beyond birthday bound (BBB) security.

Another direction of research is BBB security, where the aim is to achieve more than n/2-bits security in σ. Among the block cipher based BBB secure MACs, PMAC_Plus [36] and 3kf9 [37] are two efficient candidates. Both these candidates are three-key constructions. Recently, Dutta et al. [7] proposed a one-key candidate named 1kf9, which also offers beyond birthday security of 3kf9.

Structure graph analysis.

Structure graphs are the basic tool for analyzing sequential constructions based on random permutations, as evident from the work on CBC based MACs [2, 31, 14] and 1kf9 [7]. Although structure graphs have been mainly used in the analysis of random permutation based constructions, they have also found application in random function based constructions, as evident from the analysis of NI MAC by Gaži, Pietrzak and Rybár [13] and the one key compression function based MAC by Dutta, Nandi and Paul [10]. From our observation, these latter works [13, 7, 10] are free from the flaw that we observed for [2, 31].

3 Preliminaries

Basic notation.

Throughout the paper, we fix a positive integer $n$. Let $\mathrm{Perm}$ be the set of all permutations on $\mathcal{B} := \{0,1\}^n$. Elements of $\mathcal{B}$ are called blocks. For any two integers $a \le b$, we write $[a..b]$ (or simply $[b]$, when $a = 1$) to denote the set $\{a, a+1, \ldots, b\}$. Let $\phi$ be a property defined for the elements of $S$. We define the subset

$$S[\phi] := \{x \in S : x \text{ satisfies } \phi\}.$$

The above set will appear in this paper many times for different choices of $S$ and $\phi$. Let

$$\mathbf{P}(m, k) := m(m-1) \cdots (m-k+1)$$

denote the number of $k$-permutations of $m$.
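As a quick sanity check of this falling product, here is a small sketch; the function name `falling_factorial` is ours, and `math.perm` is the Python standard library routine (available from Python 3.8) that computes the same quantity.

```python
import math


def falling_factorial(m, k):
    """P(m, k) = m (m - 1) ... (m - k + 1), the number of k-permutations of m."""
    result = 1
    for j in range(k):
        result *= m - j
    return result


# Number of ways to pick 3 distinct outputs, in order, from an 8-element set.
ordered_triples = falling_factorial(8, 3)
matches_stdlib = ordered_triples == math.perm(8, 3)
```

Quantities of this form count the ways a permutation can be fixed on $k$ distinct inputs out of $m$ available values.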

3.1 Notation on sequences

Let $\mathcal{I}$ and $S$ be two sets. An $S$-sequence $x$ over the index set $\mathcal{I}$ is denoted as $(x[\alpha])_{\alpha \in \mathcal{I}}$, where $x[\alpha] \in S$ for all $\alpha \in \mathcal{I}$. The length of the sequence is $|\mathcal{I}|$, the size of the index set. In this paper we mostly consider block sequences, i.e. $S = \mathcal{B}$. When the index set is $[a..b]$, we also write the sequence as a tuple or vector $x[a..b] := (x[a], \ldots, x[b])$. Sometimes, by abusing notation, $x$ also represents the set $\{x[\alpha] : \alpha \in \mathcal{I}\}$. Similarly, $x[a..b]$ represents $\{x[\alpha] : \alpha \in [a..b]\}$. We write $\#x$ to denote the number of distinct elements in the sequence $x$. We write $S^+$ and $S^{\le \ell} := \bigcup_{i \le \ell} S^i$ to represent the set of all $S$-sequences of positive and finite length, and of length at most $\ell$, respectively. Now we define an equivalence relation that captures the equalities among the elements of the sequence $x$.

Definition 1

Given a sequence $x$ over an index set $\mathcal{I}$, we define an equivalence relation $\sim_x$ over the index set $\mathcal{I}$ as follows: $\alpha \sim_x \beta$ if $x[\alpha] = x[\beta]$.

Let $\rho : \mathcal{D} \to \mathcal{B}$. Let $x$ and $y$ be, respectively, $\mathcal{D}$- and $\mathcal{B}$-sequences over an index set $\mathcal{I}$. We write $x \xrightarrow{\rho} y$ to mean that $\rho(x[\alpha]) = y[\alpha]$ for all $\alpha \in \mathcal{I}$, and we simply say that $\rho$ multi-maps $x$ to $y$. This is a property of the function $\rho$. When $\mathcal{D} = \mathcal{B}$, the subset $\mathrm{Perm}[x \xrightarrow{\pi} y]$ represents the set of all permutations $\pi$ multi-mapping $x$ to $y$. We say that $(x, y)$ is permutation compatible if there exists a permutation $\pi$ such that $x \xrightarrow{\pi} y$. It is easy to see that $(x, y)$ is permutation compatible if and only if $\sim_x = \sim_y$.
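The compatibility criterion is easy to test mechanically. The sketch below assumes the two sequences are given as equal-length Python lists over the same index set; the function name `permutation_compatible` is hypothetical. It checks that equal entries of one sequence sit at exactly the positions where the other sequence has equal entries.

```python
def permutation_compatible(x, y):
    """True iff x[a] == x[b] exactly when y[a] == y[b], for all indices a, b.

    Assumes x and y are indexed by the same positions (equal length).
    """
    forward, backward = {}, {}
    for a, b in zip(x, y):
        # forward enforces "same input, same output";
        # backward enforces "same output, same input" (injectivity).
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


ok = permutation_compatible([1, 2, 1], [5, 6, 5])
not_a_function = permutation_compatible([1, 2, 1], [5, 6, 7])
not_injective = permutation_compatible([1, 2, 3], [5, 5, 6])
```

The two dictionaries are a design choice of this sketch: one pass suffices because a partial injective map always extends to a full permutation of the block set.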

3.2 Notation on strings

We call $\mathcal{B}$ an alphabet and its elements will be referred to as letters. A string over the alphabet $\mathcal{B}$ is an element of $\mathcal{B}^*$. We can also say that a string is a finite concatenation $S := a_1 a_2 \cdots a_\ell$ where $a_i \in \mathcal{B}$. Note that the elements of $\mathcal{B}$ are also strings. We can also view strings as $\mathcal{B}$-sequences over an index set $\mathcal{I}$. The length of a string $S$, denoted by $|S|$, is defined as the total number of letters in it. Note that the empty string has length $0$, as it does not contain any letters. For a string $S = XY$, $X$ (respectively $Y$) is said to be a prefix (respectively suffix) of $S$. We write $X <_1 S$ if $X$ is a prefix of $S$. We write $X <_2 S$ if $X[1..x-1] <_1 S$ but $X[x] \neq S[s]$, where $x = |X|$ and $s = |S|$. For two strings $S_1$ and $S_2$ of lengths $s_1$ and $s_2$, respectively, a non-negative integer $p := \mathsf{LCP}(S_1; S_2)$ (respectively $s := \mathsf{LCS}(S_1; S_2)$) is called the index of the largest common prefix (respectively largest common suffix) if $S_1[1..p] = S_2[1..p]$ and $S_1[p+1] \neq S_2[p+1]$ (or $S_1[s..s_1] = S_2[s..s_2]$ and $S_1[s-1] \neq S_2[s-1]$).
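The prefix relation and the largest common prefix index can be illustrated with a short sketch. Strings are modelled here as Python tuples of letters, and the helper names `is_prefix` and `lcp` are ours; the suffix variant is analogous and omitted.

```python
def is_prefix(x, s):
    """True iff s = x y for some (possibly empty) string y."""
    return len(x) <= len(s) and s[:len(x)] == x


def lcp(s1, s2):
    """Length p of the largest common prefix of s1 and s2."""
    p = 0
    while p < min(len(s1), len(s2)) and s1[p] == s2[p]:
        p += 1
    return p


first = (1, 2, 3)
second = (1, 2, 4, 5)
shared = lcp(first, second)  # the strings agree on their first two letters
```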

3.3 Basic definitions and notation of graph

Directed edge-labeled graph.

A directed edge-labeled graph is a pair $G := (V, E)$ with $E \subseteq V \times V \times L$, where $V$ is the set of vertices, $L$ is the set of edge labels, and $E$ is the set of edges along with their corresponding labels. In this paper we will consider only those directed edge-labeled graphs where for each pair of vertices $u, v \in V$ there exists at most one label $a \in L$ with $((u, v); a) \in E$. We also write $u \xrightarrow{a} v$ to mean that $((u, v); a) \in E$.

Convention.

By abusing notation, $E$ also denotes the set of unlabeled edges, and the label $a$ of the edge $e := (u, v)$ is expressed as $\mathcal{L}_G(e)$ (this notation makes sense as there is a unique choice of the label for an edge) or simply $\mathcal{L}(e)$ whenever the graph is understood.

For an edge e:=(u,v), vertex u is called a predecessor of v, and v a successor of u. An edge (u,v) is called a loop if u=v. We define two sets:

  1. The predecessor set of a vertex $v$ is $\mathsf{nbd}(*v) := \{u : (u, v) \in E\}$.

  2. The successor set of $v$ is $\mathsf{nbd}(v*) := \{u : (v, u) \in E\}$.

The sizes of the predecessor and successor sets of v are called in-degree and out-degree, respectively. We implicitly assume that no vertex has both in-degree and out-degree 0. So the vertex set and hence the graph without the edge labels is uniquely determined by the edge set.

Definition 2

A walk of length $s$ is defined as a vertex sequence $w := (w[0], \ldots, w[s])$ such that $w[i-1] \to w[i]$ for all $i \in [s]$. We define the label of the walk as $\mathcal{L}(w) := (a_1, \ldots, a_s)$ where $a_i = \mathcal{L}(w[i-1], w[i])$, $i \in [s]$.

Since a walk is a $V$-sequence over the index set $\{0, 1, \ldots, s\}$, we define a subwalk $w[a..b] := (w[a], \ldots, w[b])$ where $0 \le a \le b \le s$.

When all vertices of a walk sequence are distinct, we call it a path. When all vertices $w[0], \ldots, w[s-1]$ are distinct and $w[s] = w[0]$, we call it a cycle. Other special examples of walks, which will be studied later in the paper, are $\rho$ walks and $\rho'$ walks.

A $\rho$ walk is a walk $w := (w[0], \ldots, w[s])$ such that for some $0 \le i < j \le s$, $w[0..j-1]$ is a path, $w[j] = w[i]$ and for all $j < k \le s$, $w[k] = w[i+r]$ where $0 \le r < (j-i)$ and $(k-i-r)$ is a multiple of $(j-i)$. It is illustrated in Figure 2 (a). In words, a $\rho$ walk comes back to one previous vertex (which makes a cycle) and afterwards it remains in the cycle.

A $\rho'$ walk is an extension of a $\rho$ walk that leaves the cycle and does not come back. It is illustrated in Figure 2 (b). Note that the lengths of the subwalks labeled with $*$ can be zero.
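The verbal description of a $\rho$ walk (return to an earlier vertex, then stay on the resulting cycle) can be checked on a plain vertex sequence. This is a minimal sketch with a hypothetical function name `is_rho_walk`; it looks only at the vertex sequence and does not verify that the edges exist in a given graph.

```python
def is_rho_walk(w):
    """True iff w returns to an earlier vertex and then keeps going round that cycle."""
    first_seen = {}
    for j, v in enumerate(w):
        if v in first_seen:
            i = first_seen[v]   # w[0..j-1] is a path and w[j] == w[i]
            period = j - i      # length of the cycle
            return all(w[k] == w[i + (k - i) % period] for k in range(j, len(w)))
        first_seen[v] = j
    return False                # no repeated vertex: w is a path, not a rho walk


stays_on_cycle = is_rho_walk([0, 1, 2, 1, 2, 1])
leaves_cycle = is_rho_walk([0, 1, 2, 1, 0])
plain_path = is_rho_walk([0, 1, 2])
```

A plain cycle, which returns to its starting vertex, is accepted as the special case in which the tail before the cycle is empty.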

A directed edge-labeled graph $G = (V, E)$ is called a function graph if for all $v \in V$, there do not exist two distinct successors $v_1$ and $v_2$ of $v$ with $\mathcal{L}_G(v, v_1) = \mathcal{L}_G(v, v_2)$. In other words, for every vertex $v$ and any label $a$ we can find at most one successor $w$ for which the label of the edge $(v, w)$ is $a$. This observation can be extended to walks in a function graph $G$ as follows:

$$w_1[0] = w_2[0],\ \mathcal{L}(w_1) = \mathcal{L}(w_2) \implies w_1 = w_2.$$

So if there is a walk with label M, then it must be unique and we call such a walk M-walk.
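To make the uniqueness of an $M$-walk concrete, here is a minimal sketch that stores a function graph as a dictionary from (vertex, label) pairs to successors, so that the function-graph property holds by construction. The representation and the name `m_walk` are assumptions of this sketch, not notation from the paper.

```python
def m_walk(edges, start, labels):
    """Follow the labels from start; return the walk, or None if some edge is missing.

    edges maps (vertex, label) -> successor, so each vertex has at most one
    successor per label.
    """
    walk = [start]
    for a in labels:
        key = (walk[-1], a)
        if key not in edges:
            return None
        walk.append(edges[key])
    return walk


toy_graph = {(0, "a"): 1, (1, "b"): 2, (2, "a"): 1}
found = m_walk(toy_graph, 0, ["a", "b", "a"])
missing = m_walk(toy_graph, 0, ["b"])
```

Because every step is forced by the current vertex and the next label, two walks with the same start and the same label sequence necessarily coincide.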

Figure 2: The graphs corresponding to $\rho$ and $\rho'$ walks. Note that the lengths of the parts labeled with $*$ can be zero.

3.4 PRF advantage of a keyed function

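If $S$ is a finite set, then $x \leftarrow_{\$} S$ denotes the uniform random sampling of $x$ from $S$. Let $\mathcal{D} \subseteq \mathcal{B}^+$ be a finite set. A random function from $\mathcal{D}$ to $\mathcal{B}$ is $\mathsf{RF}(\mathcal{D}) \leftarrow_{\$} \mathrm{Func}(\mathcal{D}, \mathcal{B})$, where $\mathrm{Func}(\mathcal{D}, \mathcal{B})$ is the set of all functions from $\mathcal{D}$ to $\mathcal{B}$. When the domain $\mathcal{D}$ is understood, we simply write the random function as $\mathsf{RF}$.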

Definition 3

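Let $F$ be a keyed function from $\mathcal{D}$ to $\mathcal{B}$ with a finite key space $\mathcal{K}$. We define the prf-advantage (or pseudorandom function advantage) of an adversary $A$ against $F$ as

$$\mathbf{Adv}^{\mathsf{atk}}_F(A) := \left| \Pr[A^{F_K} = 1 : K \leftarrow_{\$} \mathcal{K}] - \Pr[A^{\mathsf{RF}} = 1] \right|.$$

The maximum prf-advantage of $F$ is defined as

$$\mathbf{Adv}^{\mathsf{atk}}_F(q, \ell, \sigma) = \max_A \mathbf{Adv}^{\mathsf{atk}}_F(A),$$

where the maximum is taken over all adversaries $A$ making at most $q$ queries from the domain $\mathcal{D}$, say $M_1, \ldots, M_q$ with $M_i \in \mathcal{B}^{m_i}$, such that $\sum_i m_i \le \sigma$ and $\max_i m_i \le \ell$. Note that $\mathsf{atk} = \mathsf{pf}$ means no query is a prefix of another; $\mathsf{atk} = \mathsf{eq}$ means the queries are of equal length; and $\mathsf{atk} = \mathsf{any}$ means all queries are arbitrary distinct strings. This is an information theoretic definition and we allow an unbounded time adversary. There is no loss in assuming that $A$ always makes exactly $q$ distinct queries, represented by a sequence, say $\mathcal{M} = (M_1, \ldots, M_q)$. In this case, for any $T = (T_1, \ldots, T_q) \in \mathcal{B}^q$, we have

$$\Pr_{\mathsf{RF}}[\mathcal{M} \xrightarrow{\mathsf{RF}} T] = 2^{-nq}.$$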

3.5 Coefficient-H technique

Let $A$ be an adversary which makes $q$ distinct queries (possibly adaptive) to $F$. Let the queries be $x_1, \ldots, x_q$ and the corresponding $F$ outputs $y_1, \ldots, y_q$. Let $\mathrm{view}(A^F)$ denote the $q$-tuple of pairs $((x_1, y_1), \ldots, (x_q, y_q))$ where $x_i$ denotes the $i$-th query and $y_i$ is the corresponding response.

For any $q$-tuple of pairs $\tau = ((x_1, y_1), \ldots, (x_q, y_q))$, the probability

$$\mathbb{P}_F(\tau) := \Pr_F[(x_1, \ldots, x_q) \xrightarrow{F} (y_1, \ldots, y_q)]$$

is called the interpolation probability, where the probability is taken over the randomness of $F$’s key. Here we assume that $F$ is stateless and so the above probability is independent of the order of the pairs.

Theorem 4 (Coefficient-H technique)

Let $\mathcal{T}_{\mathrm{good}}$ be some set of $q$-tuples of pairs. Suppose the interpolation probability for a (stateless) oracle $\mathcal{O}$ satisfies the inequality

$$\mathbb{P}_{\mathcal{O}}(\tau) \ge (1 - \epsilon)\, \mathbb{P}_{\mathsf{RF}}(\tau) = (1 - \epsilon)\, 2^{-nq} \quad \text{for all } \tau \in \mathcal{T}_{\mathrm{good}}.$$

Then, for any adversary $A$ we have

$$\mathbf{Adv}^{\mathsf{atk}}_{\mathcal{O}}(A) \le \epsilon + \Pr[\mathrm{view}(A^{\mathsf{RF}}) \notin \mathcal{T}_{\mathrm{good}}].$$

This technique was first introduced by Patarin in his PhD thesis [28] (as mentioned in [32]). The proof of this theorem can be found in [29]. So we skip the proof. We use this theorem to bound the PRF advantage of CBC function defined in the next subsection.

3.6 CBC-MAC and EMAC functions based on permutations

CBC function.

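The CBC (cipher block chaining) function (see Figure 3) with an oracle $\pi \in \mathrm{Perm}$, viewed as a key of the construction, takes as input a message $M = (M[1], \ldots, M[m]) \in \mathcal{B}^m$ with $m$ blocks and outputs

$$\mathrm{CBC}_\pi(M) := \mathsf{out}_\pi(M)[m].$$

This is inductively computed as follows: $\mathsf{out}_\pi(M)[0] = 0^n$ and

$$\mathsf{out}_\pi(M)[i] = \pi(\mathsf{in}_\pi(M)[i]), \quad \mathsf{in}_\pi(M)[i] = \mathsf{out}_\pi(M)[i-1] \oplus M[i], \quad i \in [m].$$

We call $\mathsf{in}_\pi(M)$ and $\mathsf{out}_\pi(M)$ the intermediate input and output vectors, respectively, associated to $\pi$. Note that the intermediate input vector $\mathsf{in}_\pi$ is uniquely determined by $\mathsf{out}_\pi$ (and does not otherwise depend on the permutation $\pi$). We can write down this association generically as a function $\mathsf{out2in}_M : \mathcal{B}^m \to \mathcal{B}^m$ mapping any block vector $y$ to a block vector $x$ where $x[1] = M[1]$ and $x[i] = y[i-1] \oplus M[i]$ if $1 < i \le m$. So for all permutations $\pi \in \mathrm{Perm}$, we have $\mathsf{out2in}(\mathsf{out}_\pi) = \mathsf{in}_\pi$.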
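The inductive definition and the out-to-in association can be sketched in a few lines. This is only an illustration under stated assumptions: a toy block size of four bits, a permutation stored as a shuffled Python list standing in for the key, and function names `cbc_vectors` and `out2in` chosen for this sketch.

```python
import random

N_BITS = 4  # toy block size (assumption); real block ciphers use 64 or 128 bits


def cbc_vectors(pi, msg):
    """Return the intermediate input and output vectors of msg under pi.

    List position 0 corresponds to block index 1 in the text.
    """
    ins, outs = [], []
    prev = 0                   # out[0] = 0^n
    for block in msg:
        x = prev ^ block       # in[i] = out[i-1] XOR M[i]
        prev = pi[x]           # out[i] = pi(in[i])
        ins.append(x)
        outs.append(prev)
    return ins, outs


def out2in(msg, outs):
    """Recompute the input vector from the message and the output vector only."""
    return [msg[0]] + [outs[i - 1] ^ msg[i] for i in range(1, len(msg))]


rng = random.Random(0)         # fixed seed so the example is reproducible
pi = list(range(2 ** N_BITS))
rng.shuffle(pi)

msg = [3, 7, 7, 12]
ins, outs = cbc_vectors(pi, msg)
cbc_value = outs[-1]           # CBC_pi(M) is the last intermediate output
```

Note that `out2in` never consults the permutation, which mirrors the remark that the input vector is determined by the output vector.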

Figure 3: CBC function and its intermediate values.

EMAC function.

The EMAC function (E for encrypted) is derived from the CBC function by additionally encrypting the output with another permutation $\pi' \in \mathrm{Perm}$. Formally, $\mathrm{EMAC}_{\pi,\pi'}(M) := \pi'(\mathrm{CBC}_\pi(M))$.
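A minimal sketch of this composition follows, again with permutations modelled as Python lists over a toy two-bit block space; the function names `cbc` and `emac` and the two example permutations are assumptions made for illustration.

```python
def cbc(pi, msg):
    """CBC_pi(M): chain the blocks through pi, starting from 0^n."""
    y = 0
    for block in msg:
        y = pi[y ^ block]
    return y


def emac(pi, pi2, msg):
    """EMAC: encrypt the CBC output once more under a second permutation pi2."""
    return pi2[cbc(pi, msg)]


pi = [0, 1, 2, 3]    # toy inner permutation (identity on 2-bit blocks)
pi2 = [1, 0, 3, 2]   # toy outer permutation
tag = emac(pi, pi2, [1, 2])
```

With these toy values the CBC chain visits 1 and then 3, and the outer permutation maps 3 to 2, so `tag` is 2.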

4 PRF analysis of CBC and EMAC

In this section we quickly recall the PRF analysis of CBC and EMAC as done in [2, 31]. Here CBC is based on a random permutation $\Pi$ chosen uniformly from $\mathrm{Perm}$, and EMAC is based on two independent random permutations $\Pi$ and $\Pi'$. In this section we reduce bounding the PRF advantages of CBC and EMAC to bounding the full collision and collision probabilities, respectively. We use the coefficient-H technique rather than the game playing technique used in [2].

4.1 PRF advantage of EMAC

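Let $M_1$ and $M_2$ be two distinct tuples of blocks. Let $\mathsf{coll}_\pi(M_1; M_2)$ denote the event that $\mathrm{CBC}_\pi(M_1) = \mathrm{CBC}_\pi(M_2)$; we call it the collision event for the pair of messages $M_1$ and $M_2$. We similarly define the collision event for a tuple of $q \ge 2$ distinct messages $\mathcal{M} = (M_1, \ldots, M_q)$ as

$$\mathsf{coll}_\pi(\mathcal{M}) = \bigcup_{i \neq j} \mathsf{coll}_\pi(M_i; M_j).$$

We define the collision probability as $\mathbf{CP}_n(\mathcal{M}) = \Pr[\mathsf{coll}_\Pi(\mathcal{M})]$.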
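For very small block sizes the collision probability of a pair can be computed exactly by enumerating every permutation. The sketch below does this for toy parameters; the function names are ours, and the enumeration is only feasible for $n \le 3$ or so. For the pair $M_1 = (0)$, $M_2 = (0, 0)$ a collision means $\pi(\pi(0)) = \pi(0)$, which by injectivity happens exactly when $\pi(0) = 0$, so the result should be $2^{-n}$.

```python
from fractions import Fraction
from itertools import permutations


def cbc(pi, msg):
    """CBC_pi(M) with pi given as a tuple or list over {0, ..., 2**n - 1}."""
    y = 0
    for block in msg:
        y = pi[y ^ block]
    return y


def exact_cp(n, m1, m2):
    """Exact collision probability of (m1, m2) over a uniform n-bit permutation."""
    total = hits = 0
    for pi in permutations(range(2 ** n)):
        total += 1
        hits += cbc(pi, m1) == cbc(pi, m2)
    return Fraction(hits, total)


cp_small = exact_cp(2, [0], [0, 0])   # 24 permutations of a 4-element set
```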

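Let $\mathbf{CP}^{\mathsf{atk}}_{q,\ell} = \max_{\mathcal{M}} \mathbf{CP}_n(\mathcal{M})$, where the maximum is taken over all $q$-tuples $\mathcal{M}$ of distinct messages having at most $\ell$ blocks each and satisfying $\mathsf{atk}$ (i.e., when $\mathsf{atk} = \mathsf{eq}$, messages must have equal length; when $\mathsf{atk} = \mathsf{pf}$, no message is a prefix of another; and $\mathsf{atk} = \mathsf{any}$ means no restriction other than the length restriction). Following [2], we view EMAC as an instance of the Carter–Wegman paradigm [33]. This enables us to reduce the problem of bounding the prf-advantage of EMAC to bounding the collision probability as

$$\mathbf{Adv}^{\mathsf{any}}_{\mathrm{EMAC}}(q, \ell) \le \mathbf{CP}^{\mathsf{any}}_{q,\ell} + \frac{q(q-1)}{2^{n+1}}. \tag{1}$$

Note that $\mathbf{CP}^{\mathsf{any}}_{q,\ell} \le \binom{q}{2} \mathbf{CP}^{\mathsf{any}}_{2,\ell}$, as the collision event for $q$ messages is the union of the collision events for each of the $\binom{q}{2}$ pairs of messages. Bellare, Pietrzak and Rogaway [2] proved that

$$\mathbf{CP}^{\mathsf{any}}_{2,\ell} \le \frac{2 d'(\ell)}{2^n} + \frac{64 \ell^4}{2^{2n}},$$

where $d'(\ell) = \max_{\ell' \le \ell} d(\ell')$ and $d(\ell')$ is the number of divisors of $\ell'$. In [34], Wigert showed that

$$d'(\ell) = \ell^{1/\Theta(\ln \ln \ell)} = \ell^{o(1)}.$$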
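To get a feel for how slowly the divisor term grows, the following sketch evaluates $d$, $d'$ and the right-hand side of the pairwise bound of [2] for concrete parameters. The function names are ours, the divisor count uses naive trial division (fine for small $\ell$), and exact rational arithmetic avoids floating point underflow.

```python
from fractions import Fraction


def num_divisors(k):
    """d(k): the number of positive divisors of k (naive trial division)."""
    return sum(1 for j in range(1, k + 1) if k % j == 0)


def d_prime(ell):
    """d'(ell): the largest divisor count among 1, ..., ell."""
    return max(num_divisors(k) for k in range(1, ell + 1))


def cp_pair_bound(ell, n):
    """Right-hand side 2 d'(ell) / 2^n + 64 ell^4 / 2^(2n) of the bound from [2]."""
    return Fraction(2 * d_prime(ell), 2 ** n) + Fraction(64 * ell ** 4, 2 ** (2 * n))


small = d_prime(12)                 # 12 has the divisors 1, 2, 3, 4, 6, 12
bound = cp_pair_bound(ell=16, n=64)
```

For $\ell = 16$ and $n = 64$ the first term dominates, which illustrates why the bound behaves like $d'(\ell)/2^n$ while $\ell$ stays well below $2^{n/4}$.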

Using this bound on the collision probability for a pair of messages, we see that the prf-advantage of EMAC is about $O(d'(\ell) q^2/2^n)$ for $\ell < 2^{n/4}$. Later Pietrzak [31] provided an improved analysis of EMAC and proved that the PRF advantage of EMAC is about $O(q^2/2^n)$ for $\ell < \min\{q^{1/2}, 2^{n/8}\}$. We revisit this improved analysis later in Section 8. A related claim on $\mathbf{CP}$ is

𝐂𝐏2,𝖾𝗊=2-n+(d())22-2n+62-3n

(see [9]) which gives a tight bound for equal length messages.

4.2 PRF advantage of CBC

Now we revisit the security analysis of the CBC-MAC construction. Let $\mathsf{Fcoll}_\pi(M_1; M_2)$, called full collision, denote the event that

$$\mathsf{in}_\pi(M_2)[m_2] = \mathsf{in}_\pi(M_r)[j] \quad \text{for some } (r, j) \neq (2, m_2).$$

In other words, if the full collision event does not hold, then the last intermediate input of π is “fresh” (not appeared before) while computing CBCπ(M2). So when π is replaced by a random permutation and this event does not hold, then the CBC-output should behave “almost” randomly. We use this intuition while we provide a bound of prf-advantage of CBC.
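The full collision event for a pair of messages can be tested directly from its definition. The sketch below assumes toy four-bit blocks and a list-encoded permutation; the names `cbc_inputs` and `full_collision` are hypothetical. It also illustrates why prefix queries are problematic: if $M_2$ is a prefix of $M_1$, the last input of $M_2$ reappears inside the computation on $M_1$ for every permutation.

```python
import random


def cbc_inputs(pi, msg):
    """Intermediate input vector in_pi(M)."""
    ins, y = [], 0
    for block in msg:
        x = y ^ block
        ins.append(x)
        y = pi[x]
    return ins


def full_collision(pi, m1, m2):
    """True iff the last input of m2 equals some other input of m1 or m2."""
    in1, in2 = cbc_inputs(pi, m1), cbc_inputs(pi, m2)
    last = in2[-1]
    return last in in1 or last in in2[:-1]


rng = random.Random(1)
pi = list(range(16))
rng.shuffle(pi)

prefix_case = full_collision(pi, [5, 9, 3], [5, 9])        # m2 is a prefix of m1
fresh_case = full_collision(list(range(16)), [1], [2])     # identity permutation
```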

Remark 1

We would like to remark that in the original paper [2], the full collision event is defined through the intermediate outputs instead of inputs. Since we consider CBC based on permutation only, equalities among inputs and equalities among outputs are the same.

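For a $q$-tuple of messages $\mathcal{M}$, the union of full collision events is similarly denoted by $\mathsf{Fcoll}_\pi(\mathcal{M})$. The probability of this event, called the full collision probability, is denoted by $\mathbf{FCP}_n(\mathcal{M})$. The maximum full collision probability is denoted by $\mathbf{FCP}^{\mathsf{atk}}_{q,\ell}$. Similar to inequality (1), the following result has been proved in [2]:

$$\mathbf{Adv}^{\mathsf{pf}}_{\mathrm{CBC}}(q, \ell) \le q^2\left(\mathbf{FCP}^{\mathsf{pf}}_{2,\ell} + 4\ell/2^n\right). \tag{2}$$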

Note that we must restrict the adversary to make prefix-free queries, since otherwise it would be easy to distinguish CBC from a random function (using the classical length extension attack). Similarly, if M2 is a prefix of M1, it is easy to see that 𝐅𝐂𝐏n(M1,M2)=1, so the above result becomes meaningless. As before, we also state an equivalent form of PRF advantage of CBC in terms of full collision probability among q messages. The above inequality (2) would be again a straightforward application of the following result.

Proposition 2

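We have

$$\mathbf{Adv}^{\mathsf{pf}}_{\mathrm{CBC}}(q, \ell, \sigma) \le \mathbf{FCP}^{\mathsf{pf}}_{q,\ell} + \frac{2\sigma q}{2^n} + \frac{q^2}{2^{n+1}}.$$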

Proof.

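Let $\mathcal{T}_{\mathrm{good}}$ be the set of all tuples $((M_1, T_1), (M_2, T_2), \ldots, (M_q, T_q))$ with $\mathcal{M} = (M_1, \ldots, M_q) \in (\mathcal{B}^+)^q$ and $T = (T_1, \ldots, T_q) \in \mathcal{B}^q$ such that the $M_i$ are distinct and the $T_i$ are also distinct. Trivially, a random function $\mathsf{RF}$ returns a colliding pair on any $q$ distinct queries with probability at most $\binom{q}{2} 2^{-n}$ for any adversary $A$. Thus,

$$\Pr[\mathrm{view}(A^{\mathsf{RF}}) \notin \mathcal{T}_{\mathrm{good}}] \le \frac{q^2}{2^{n+1}}.$$

Using the coefficient-H technique, we now only need to bound the relationship between the interpolation probabilities. We fix $\mathcal{M} = (M_1, \ldots, M_q) \in (\mathcal{B}^+)^q$ and $T = (T_1, \ldots, T_q) \in \mathcal{B}^q$ such that the $M_i \in \mathcal{B}^{m_i}$ are distinct and the $T_i$ are also distinct. Let $m_i \le \ell$ for all $i$ and write $\sum_i m_i = m \le \sigma$. Now, a permutation $\pi$ is called bad if

  1. $\mathsf{Fcoll}_\pi(\mathcal{M})$ holds, or

  2. $\mathsf{out}_\pi(M_r)[i] = T_{r'}$ for some $r, r' \in [q]$, $i \in [m_r]$.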

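All other permutations are called good. We define an equivalence relation on $\mathrm{Perm}$ as $\pi \sim \pi'$ if $\mathsf{in}_\pi(M_r) = \mathsf{in}_{\pi'}(M_r)$ for all $r$. It is clearly an equivalence relation, and a good permutation can only be related to another good permutation. Let $\mathcal{C}$ be an equivalence class consisting of good permutations. Let $s$ be the number of distinct intermediate inputs (not counting the $q$ final inputs, which are fresh for a good permutation) in the computation of all $\mathrm{CBC}_\pi(M_r)$ where $\pi \in \mathcal{C}$. Note that $s$ is the same for all $\pi \in \mathcal{C}$. Then $|\mathcal{C}| = (2^n - s)!$, as the outputs of exactly $s$ inputs of $\pi$ are determined. Since the $T_i$ are not intermediate outputs, we have

$$|\mathcal{C}[\mathcal{M} \xrightarrow{\mathrm{CBC}_\Pi} T]| = (2^n - s - q)!$$

(since $q$ additional restrictions on input-output pairs are being added). So for any class of good permutations $\mathcal{C}$,

$$\Pr[\mathcal{M} \xrightarrow{\mathrm{CBC}_\Pi} T \mid \Pi \in \mathcal{C}] = \frac{(2^n - s - q)!}{(2^n - s)!} \ge 2^{-nq}.$$

Thus,

$$\Pr[\mathcal{M} \xrightarrow{\mathrm{CBC}_\Pi} T] \ge \sum_{\mathcal{C} \text{ is good}} \Pr[\mathcal{M} \xrightarrow{\mathrm{CBC}_\Pi} T \mid \Pi \in \mathcal{C}] \times \Pr[\Pi \in \mathcal{C}] \ge \Pr[\Pi \text{ is good}] \times 2^{-nq}.$$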

So it is sufficient to bound the probability that a random permutation is bad. Then we will be done by using the coefficient-H technique as stated in Theorem 4. By the definition of the full collision probability, the first condition for a permutation to be bad can happen with probability at most $\mathbf{FCP}^{\mathsf{pf}}_{q,\ell}$. The second condition says that we sample at most $m$ outputs of a random permutation and one of them belongs to the set $\{T_1, \ldots, T_q\}$. This can happen with probability at most $mq/(2^n - m)$, which is in turn less than $mq/2^{n-1}$ provided $m < 2^{n-1}$. Note that $m \le \sigma$. If $m \ge 2^{n-1}$, then the above bound holds trivially. So the probability of a bad permutation is bounded by $\mathbf{FCP}^{\mathsf{pf}}_{q,\ell} + mq/2^{n-1}$. After applying the coefficient-H technique, we have proved the result. ∎

Remark 3

Note that $\mathbf{FCP}^{\mathsf{pf}}_{q,\ell} \le q(q-1)\, \mathbf{FCP}^{\mathsf{pf}}_{2,\ell}$ by considering all ordered pairs $(M_i, M_j)$. This also proves the original claim from [2] as stated in inequality (2). In fact, it is potentially a better bound than the original as it uses the total number of blocks $\sigma$ instead of $\ell q$. In [2], it is proved that

$$\mathbf{FCP}^{\mathsf{pf}}_{2,\ell} \le \frac{8\ell}{2^n} + \frac{64\ell^4}{2^{2n}}.$$

In Section 7 we revisit the above bound. In particular, we revise the proof in light of the flaw in [2, Lemma 10] and get an increase in the multiplicative factor. Moreover, our revised bound on $\mathbf{FCP}^{\mathsf{pf}}_{q,\ell}$ is of the order $\sigma q/2^n$ instead of $\ell q^2/2^n$ (whenever $\ell \le 2^{n/3}$). So our analysis rectifies the previous proof and also provides a better bound in some cases (e.g., when the average message length is much smaller than the length of the longest message, which may occur when message lengths are very skewed).

5 Revisiting structure graph

In the previous section we have seen how bounding the PRF advantage of CBC or EMAC essentially reduces to bounding some collision events among the internal inputs or outputs of the underlying permutation. Thus, it is useful to have an object which deals with the intermediate inputs and outputs. The structure graph does so, and it has been used to bound the (full) collision probabilities in [2]. In this section we revisit the structure graph and show that one of the main claims in [2] about structure graphs (namely, [2, Lemma 10]) is false.

Notation and conventions for this section.

Let us fix a tuple of messages $\mathcal{M} = (M_1,\ldots,M_q)$ throughout this section, where $M_i \in \mathcal{B}^{m_i}$, and let $m := \sum_{i=1}^{q} m_i$ and $\max_i m_i = \ell$.

5.1 Intermediate inputs and outputs

Index set.

We first collect all intermediate inputs and outputs which are obtained through the computation of $\mathrm{CBC}_{\pi}(M_r)$ for all $r$. These intermediate values will be defined as a sequence over a two-dimensional index set. Each index is a pair where the first element of the pair corresponds to the message number and the second element is the block number of that message. More formally, we define the index set

$$\mathcal{I} = \{(r,i) : r \in [q],\ i \in [m_r]\}$$

and the dictionary order on it as follows: $(r,i) \prec (r',i')$ if $r < r'$, or $r = r'$ and $i < i'$. Let $x$ be a sequence over this index set. For any $r \in [q]$, we denote the subsequence $(x[r,1],\ldots,x[r,m_r])$ by $x[r,*]$. Sometimes we also consider the index set $\mathcal{I}_0 = \{(r,0) : r \in [q]\}$, and the natural extension of the order to $\mathcal{I}_0$.

Sequences for intermediate inputs and outputs.

We denote the sequences of intermediate outputs and inputs over the index set as $\mathsf{out}^{\pi}(\mathcal{M})$ and $\mathsf{in}^{\pi}(\mathcal{M})$, respectively, where

$$\mathsf{out}^{\pi}(\mathcal{M})[r,*] = \mathsf{out}^{\pi}(M_r), \qquad \mathsf{in}^{\pi}(\mathcal{M})[r,*] = \mathsf{in}^{\pi}(M_r) \quad \text{for all } r \in [q].$$

For a single message, we have seen before that the intermediate input sequence is uniquely determined by the intermediate output sequence, and we denote the association by a function $\mathsf{out2in}$. The same is true for $q$ messages and we extend this definition as follows: given any block sequence $y$ over the index set, we define $\mathsf{out2in}(y)$ as a block sequence $x$ over the same index set where $x[r,*] = \mathsf{out2in}_{M_r}(y[r,*])$ for $r \in [q]$. Thus, for any $\pi$, we have $\mathsf{out2in}(\mathsf{out}^{\pi}) = \mathsf{in}^{\pi}$.

5.2 Structure graphs and block-vertex structure graphs

A block-vertex structure graph is a graph-theoretic representation of the intermediate output sequence $\mathsf{out}^{\pi}$. The block-vertex structure graph $\mathsf{Bstruct}^{\pi}$ for a permutation $\pi$ is defined by the set of labeled edges

$$E := \bigcup_{r=1}^{q} \{(\mathsf{out}^{\pi}[r,i-1],\ \mathsf{out}^{\pi}[r,i];\ M_r[i]) : i \in [m_r]\}.$$

Clearly, $G$ is a union of $M_i$-walks for all $i \in [q]$, and the vertex $0^n \in V$ has positive out-degree. Let $\mathsf{Bstruct}(\mathcal{M})$ denote the set of all block-vertex structure graphs for the tuple of messages $\mathcal{M}$. Note that, as explained below,

$$v \xrightarrow{A} w \;\Rightarrow\; \pi(v \oplus A) = w.$$

So, for every $v \in V$, all outward edges (and similarly all inward edges) have distinct edge labels. Using this property, it is easy to see that the walks are unique, and we denote them by $w_{M_i}$, or simply $w_i$ whenever the message tuple is understood. See Figure 4 for a single message (i.e., $q=1$) in which the input and output vectors are stored in a directed graph.

While storing the intermediate sequences as a set of labeled edges, we may lose the order as well as the repetition of the elements. Interestingly, we can uniquely reconstruct the intermediate sequences from such an edge-labeled graph by using the uniqueness of $M_i$-walks. More precisely, $\mathsf{out}^{\pi}[r,i] = w_r[i]$.
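As an illustration, the following Python sketch computes the intermediate sequences of CBC for one message, stores the outputs as a labeled edge set, and recovers both sequences by following the unique walk from 0. The 2-bit permutation `PI`, the message `M` and all function names are hypothetical choices made for this example; blocks are modelled as small integers with `^` as XOR.

```python
# Toy model: blocks are small integers, XOR is ^, and the chaining value starts at 0.
PI = {0: 0, 1: 2, 2: 3, 3: 1}   # hypothetical 2-bit permutation
M = (1, 0, 2, 0, 2)             # hypothetical message of five blocks


def cbc_intermediates(pi, msg):
    """Return the intermediate input and output sequences of CBC under pi."""
    ins, outs, prev = [], [], 0
    for block in msg:
        x = prev ^ block        # intermediate input to pi
        prev = pi[x]            # intermediate output of pi
        ins.append(x)
        outs.append(prev)
    return ins, outs


def edge_set(pi, msgs):
    """Labeled edges (out[i-1], out[i], M[i]) collected over all messages."""
    edges = set()
    for msg in msgs:
        _, outs = cbc_intermediates(pi, msg)
        walk = [0] + outs
        for i, block in enumerate(msg):
            edges.add((walk[i], walk[i + 1], block))
    return edges


def follow_walk(edges, msg):
    """Recover the output sequence by following the unique msg-walk from 0."""
    step = {(u, label): v for (u, v, label) in edges}
    v, outs = 0, []
    for block in msg:
        v = step[(v, block)]
        outs.append(v)
    return outs


INS, OUTS = cbc_intermediates(PI, M)
EDGES = edge_set(PI, [M])
RECOVERED_OUTS = follow_walk(EDGES, M)
# The inputs follow from in[i] = out[i-1] XOR M[i].
RECOVERED_INS = [u ^ b for u, b in zip([0] + RECOVERED_OUTS[:-1], M)]
```

Although the five-step walk collapses to three labeled edges, the order and the repetitions of the sequence are recovered because each vertex has at most one outgoing edge per label.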

Let $G=(V,E)$ be a labeled directed graph and $f: V \to V^*$ a bijective function. Then one can define a labeled directed graph $G^*=(V^*,E^*)$ isomorphic to $G$ for which $f$ is an isomorphism. More precisely, $((u,v);a) \in E$ if and only if $((f(u),f(v));a) \in E^*$. When $f$ is only injective, we restrict its range to its image set, which makes it bijective. We call the graph obtained in this way the transformed $G$ with respect to $f$.

Figure 4

Let $M_1=(1,0,0,0,0)$ and $\pi(1)=2$, $\pi(2)=3$, $\pi(3)=2$. For any such $\pi$, we have $\mathsf{out}^{\pi}=(2,3,2,3,2)$ and $\mathsf{in}^{\pi}=(1,2,3,2,3)$. However, the graph consists of three vertices $\{0,2,3\}$ and edge set $E=\{(0,2),(2,3),(3,2)\}$ with labels 1, 0 and 0, respectively. We see that the intermediate input and output sequences can actually be reconstructed from this labeled structure graph. The walk corresponding to the message $M_1$ uniquely identifies the output vector as $\mathsf{out}^{\pi}=(2,3,2,3,2)$, and the input vector $\mathsf{in}^{\pi}=(1,2,3,2,3)$ can be constructed using the relation between input, output and message.

Definition 1

For every vertex $v$ of a block-vertex structure graph $G=(V,E)$, we define a mapping $\alpha$ on $V$ by $\alpha_v = \alpha(v) = (r,i)$, where $(r,i)$ is the minimum index such that $w_r[i] = v$. Clearly, it is an injective mapping with an image set, say $V^*$. The structure graph $G^*=(V^*,E^*)$ associated to $\pi$ is the $\alpha$-transformed block-vertex structure graph.

Figure 5

Structure graph corresponding to the labeled structure graph.

Example 2

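Let

$$M_1 = (M_1[1], M_1[2], M_1[3], M_1[2], M_1[4]), \qquad M_2 = (M_2[1])$$

be two messages, and for $\pi \in \mathsf{Perm}$ let

$$\mathsf{in}^{\pi}[1,*] = (Y_0 \oplus M_1[1],\ Y_1 \oplus M_1[2],\ Y_2 \oplus M_1[3],\ Y_1 \oplus M_1[2],\ Y_2 \oplus M_1[4]),$$
$$\mathsf{out}^{\pi}[1,*] = (Y_1, Y_2, Y_1, Y_2, Y_3), \qquad \mathsf{in}^{\pi}[2,*] = (Y_0 \oplus M_2[1]), \qquad \mathsf{out}^{\pi}[2,*] = (Y_3).$$

The corresponding block labeled structure graph $\mathsf{Bstruct}^{\pi}$ is shown in Figure 5 (a). Following the above steps, we arrive at a valid structure graph $\mathsf{struct}^{\pi}$ in Figure 5 (b).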
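One concrete instance of Example 2 can be sketched in Python as follows. The 3-bit permutation and the block values are hypothetical, chosen so that the intermediate outputs follow the pattern $(Y_1,Y_2,Y_1,Y_2,Y_3)$ and $(Y_3)$; the function names are likewise illustrative. The sketch maps every block vertex to the minimum index $(r,i)$ at which it first appears and relabels the edges accordingly.

```python
# Hypothetical 3-bit instance of Example 2 (blocks are integers, XOR is ^).
PI = {0: 0, 1: 2, 2: 1, 3: 4, 4: 3, 5: 5, 6: 7, 7: 6}
M1 = (1, 1, 5, 1, 2)   # pattern (a, b, c, b, d)
M2 = (6,)              # 6 == 1 ^ 5 ^ 2


def cbc_outputs(pi, msg):
    """Intermediate outputs of CBC under pi, starting from chaining value 0."""
    outs, prev = [], 0
    for block in msg:
        prev = pi[prev ^ block]
        outs.append(prev)
    return outs


def structure_graph(pi, msgs):
    """Return (alpha, edges): alpha maps block vertices to their minimum index."""
    alpha = {}
    edges = set()
    walks = [[0] + cbc_outputs(pi, msg) for msg in msgs]
    # Visit indices in dictionary order, so the first visit is the minimum index.
    for r, walk in enumerate(walks, start=1):
        for i, v in enumerate(walk):
            alpha.setdefault(v, (r, i))
    for msg, walk in zip(msgs, walks):
        for i, block in enumerate(msg):
            edges.add((alpha[walk[i]], alpha[walk[i + 1]], block))
    return alpha, edges


ALPHA, EDGES = structure_graph(PI, [M1, M2])
```

In this instance the vertices are $(1,0)$, $(1,1)$, $(1,2)$ and $(1,5)$, and both $(1,1)$ and $(1,5)$ have in-degree 2.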

Let $w_r^*$ denote the $M_r$-walk in $G^*$. It is easy to see that a structure graph is again a union of $M_r$-walks $w_r^*$ starting from 0.[2] A structure graph is called a zero-output graph if 0 has positive in-degree; otherwise we call it a non-zero-output graph. To express this mathematically, we define a binary function $\mathrm{Iszero}$ such that $\mathrm{Iszero}(G^*) = 1$ for each zero-output graph $G^*$, and $\mathrm{Iszero}(G^*) = 0$ otherwise.

To reconstruct a block-vertex structure graph realizing $G^*$ we have to find labels from $\mathcal{B}$ for all the vertices in a “consistent manner”, and we call such a labeling valid. Basically, we need to find an injective mapping $\alpha^{-1}: V^* \to \mathcal{B}$ such that the image set of $\alpha^{-1}$ is $V$ and $\alpha := (\alpha^{-1})^{-1}$ is an isomorphism.

Definition 3

An injective function $Y: V^* \to \mathcal{B}$ is called a valid block label for a structure graph $\mathcal{S}=(V^*,E^*)$ if the graph $G=(V,E)$ is a block-vertex structure graph, where

  1. $V = \{0^n\} \cup \{Y_i := Y(i) : i \in V^*\}$ and

  2. $E$ is the edge set after relabeling $i$ by $Y_i$ (we assume $Y_0 := 0^n$).

Necessary condition of valid labeling function Y.

Now we derive necessary conditions for a valid labeling. First of all, by definition, the $Y_i$ must all be distinct, as a valid block label is injective (distinct vertices get distinct block labels). In addition, whenever $e_1 := (u,z)$ and $e_2 := (v,z)$ are both in $E$, the XOR of $Y_u$ with the label of $e_1$ must equal the XOR of $Y_v$ with the label of $e_2$, as both are inputs for the vertex $z$. An input-collision, or simply a collision, of a graph $G$ is defined by such a triple $\delta = (u,v;z)$. The set $\{u,v\}$ is called the source of the collision, whereas $z$ is called the head of the collision. We also say that the edges $e_1$ and $e_2$ are colliding edges. Thus, an input-collision $\delta = (u,v;z)$ induces a linear restriction $L_\delta : Y_u \oplus Y_v = c_\delta$, where $c_\delta$ is the XOR of the labels of $(u,z)$ and $(v,z)$. A valid block label must satisfy this condition for all collisions $\delta$. Let $\Delta_{G^*}$ denote the set of all collisions of $G^*$. Let $\mathrm{rank}(G^*)$ denote the rank of the system of linear equations $\{L_\delta : \delta \in \Delta_{G^*}\}$. The accident of a structure graph is defined depending on whether the graph is zero-output or not.

Definition 4

We define the accident of a structure graph $G^*$ as $\mathbf{Acc}(G^*) := \mathrm{rank}(G^*) + \mathrm{Iszero}(G^*)$. Thus, the accident of a non-zero-output structure graph $G^*$ is $\mathrm{rank}(G^*)$, whereas the accident of a zero-output graph is $\mathrm{rank}(G^*)+1$.

Lemma 5

If there is a vertex $v$ with in-degree $d$, then $\mathrm{rank}(G^*) \ge d-1$. Moreover, if the graph is a zero-output graph, then $\mathbf{Acc}(G^*) \ge d$.

Proof.

Let $v_1,\ldots,v_d$ be all predecessors of $v$, and define the input-collisions $\delta_{i,j} := (v_i,v_j;v)$. Writing $L_{1,i}$ for $L_{\delta_{1,i}}$, it is easy to see that $L_{\delta_{i,j}} = L_{1,i} \oplus L_{1,j}$. Moreover, the $L_{1,i}$ for $2 \le i \le d$ are linearly independent. Thus, the first part is proved. The second part follows from the first part and the definition of the accident. ∎

Remark 6

Another simple but useful observation is as follows: if a structure graph $G^*$ has at least two collisions with different sources, then $\mathsf{rank}(G^*) \ge 2$.

Let $S=(V^*,E^*)$ be a structure graph with rank $r$ and $|V^*| = s+1$. Then from linear algebra we know that some $s-r$ choices of $Y_i$ values uniquely determine the rest, and so the number of valid block labelings is at most $\mathbf{P}(2^n, s-r)$. Any valid choice of $Y$ induces a block-vertex structure graph $G=(V,E)$ such that $G^* = S$. Note that $s + \mathrm{Iszero}(G)$ is the number of vertices $v \in V$ with positive in-degree. So exactly $(2^n - s - \mathrm{Iszero}(G))!$ permutations result in the block-vertex structure graph $G$. Therefore,

$$\Pr[\mathsf{Bstruct}^{\Pi} = G] = \frac{(2^n - (s + \mathrm{Iszero}(G)))!}{2^n!} = \frac{1}{\mathbf{P}(2^n, s + \mathrm{Iszero}(G))}.$$

So

$$\Pr[\mathsf{struct}^{\Pi} = S] = \sum_{G : G^* = S} \Pr[\mathsf{Bstruct}^{\Pi} = G].$$

Here the sum is taken over all block-vertex structure graphs $G$ whose induced structure graph is $G^* = S$. As there are at most $\mathbf{P}(2^n, s-r)$ such block-vertex structure graphs (by bounding the number of valid block label functions as described above and using $s+1 \le m$), we have proved the following important result.

Lemma 7

For any structure graph $S$ with accident $a$, we have

$$\Pr[\mathsf{struct}^{\Pi} = S] \le \frac{1}{(2^n - m)^a}.$$

Now we state another important result which bounds the number of structure graphs with accident a. The proof of this result can be found in [2, 31]. So we skip the proof here.

Lemma 8

The number of structure graphs associated to $\mathcal{M} = (M_1,\ldots,M_q)$ with accident $a$ is at most $\binom{m}{2}^a$. In particular, there exists exactly one structure graph with accident 0.

Corollary 9

Let $a \ge 1$ be an integer. Then,

$$\Pr[\mathbf{Acc}(\mathsf{struct}^{\Pi}) \ge a : \Pi \xleftarrow{\$} \mathsf{Perm}] \le \left(\frac{m^2}{2^n}\right)^a.$$

This follows from a straightforward algebraic simplification after applying Lemma 7 and Lemma 8, so we skip the proof.
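The simplification can be checked numerically for small parameters. The following Python sketch, with illustrative parameter values and hypothetical function names, sums the product of the bounds from Lemma 7 and Lemma 8, namely $\binom{m}{2}^{a'}/(2^n-m)^{a'}$, over all accidents $a'$ from $a$ up to a cut-off and compares the result with $(m^2/2^n)^a$. Exact rational arithmetic avoids rounding issues. It is a sanity check for one parameter choice, not a proof.

```python
from fractions import Fraction
from math import comb


def union_bound(n, m, a, cutoff):
    """Sum over a' in [a, cutoff] of C(m,2)^a' / (2^n - m)^a'."""
    ratio = Fraction(comb(m, 2), 2**n - m)
    return sum(ratio**k for k in range(a, cutoff + 1))


def corollary_bound(n, m, a):
    """The closed-form bound (m^2 / 2^n)^a."""
    return Fraction(m * m, 2**n) ** a


N, M_BLOCKS = 16, 20   # illustrative values only
CHECKS = [
    union_bound(N, M_BLOCKS, a, cutoff=M_BLOCKS) <= corollary_bound(N, M_BLOCKS, a)
    for a in (1, 2, 3)
]
```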

5.3 True collision and an observation on [2, Lemma 10]

The accident of a structure graph cannot easily be read off from the graph itself, so it is useful to have a more transparent measure. True collision is such a measure. Let $G^*$ be a structure graph and $w_i^*$ the $M_i$-walks. Suppose we reconstruct the graph $G^*$ by making all the walks $w_i^*$ for $i = 1$ to $q$. While we walk along $w_i^*$ for all $i$, we count how many times we reach an existing vertex in a way that increases its current in-degree. The total count is defined to be the number of true collisions of the graph. Mathematically, one can define it as follows: for a vertex $v \in V^* \setminus \{0\}$, we define the number of true collisions at $v$ by $\mathbf{TC}(v) := |\mathsf{nbd}(* \to v)| - 1$, and $\mathbf{TC}(0) = |\mathsf{nbd}(* \to 0)|$, where $\mathsf{nbd}(* \to v)$ is the set of in-neighbours of $v$. So the above count is the sum $\mathbf{TC}(G^*) := \sum_{v \in V^*} \mathbf{TC}(v)$. By Lemma 5 we know that $\mathbf{Acc}(G^*) \ge \mathbf{TC}(v)$ for all $v \in V^*$. From the definition of the accident it is also clear that $\mathbf{Acc}(G^*) \le \mathbf{TC}(G^*)$.

Lemma 10 of [2].

To identify all structure graphs with accident 1, it would help to have a relationship between true collisions and the accident. Lemma 10 of [2] was meant for this. It says that when $q=2$, $\mathbf{Acc}(G^*) = 1 \Rightarrow \mathbf{TC}(G^*) = 1$. This lemma is false, as the counter-examples in Figure 6 show. The lemma has been used to bound the PRF advantage of CBC [2] and EMAC [31, 2]. Since it does not hold, it is important to revisit those proofs and rectify the results as far as possible.

Figure 6

The counter-examples. (a) $M_1=(M_1[1],M_1[2],M_1[3],M_1[2],M_1[4])$ and $M_2=(M_2[1])$ are two messages such that $M_2[1] := M_1[1] \oplus M_1[3] \oplus M_1[4]$. Here we have two input-collisions $\delta_1 := ((1,0),(1,2);(1,1))$ and $\delta_2 := ((1,0),(1,2);(1,5))$. The two linear equations $L_{\delta_1}$ and $L_{\delta_2}$ corresponding to the two input-collisions are the same, namely $Y_{(1,0)} \oplus Y_{(1,2)} = M_1[1] \oplus M_1[3]$, and so the rank (which is also the accident in this case) is one. However, the number of true collisions is two (at $(1,1)$ and $(1,5)$), which contradicts [2, Lemma 10]. Similar arguments can be given for figure (b), where $M_1=(M_1[1],M_1[2],M_1[3])$ and $M_2=(M_2[1])$, such that $M_2[1] := M_1[1] \oplus M_1[2] \oplus M_1[3]$.
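The gap between accident and true collisions in Figure 6 (a) can be checked mechanically. The Python sketch below is an illustration under stated assumptions: vertices are abbreviated as 0, 1, 2, 5 for the indices (1,0), (1,1), (1,2), (1,5), edges are (tail, head) pairs with labels omitted, and the function names are hypothetical. It computes the rank over GF(2) of the left-hand sides $Y_u \oplus Y_v$ of the collision equations, which equals the rank of the system whenever the system is consistent, as it is for a graph realized by a permutation.

```python
from collections import defaultdict


def gf2_rank(rows):
    """Rank over GF(2) of bitmask rows, by elimination on the lowest set bit."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def accident_and_true_collisions(edges, root=0):
    """Return (accident, true collisions) for a set of (tail, head) edges."""
    preds = defaultdict(set)
    for tail, head in edges:
        preds[head].add(tail)
    vertices = sorted({v for e in edges for v in e} | {root})
    bit = {v: 1 << k for k, v in enumerate(vertices)}
    rows = []
    for sources in preds.values():
        first, *others = sorted(sources)
        # Collisions (first, u; head) span all collisions at this head (Lemma 5).
        rows.extend(bit[first] ^ bit[u] for u in others)
    iszero = 1 if preds[root] else 0
    acc = gf2_rank(rows) + iszero
    # TC(v) = in-degree - 1 for v != root, and TC(root) = in-degree of the root.
    non_root_heads = sum(1 for h, p in preds.items() if h != root and p)
    tc = sum(len(p) for p in preds.values()) - non_root_heads
    return acc, tc


# Figure 6 (a): walk 0 -> 1 -> 2 -> 1 -> 2 -> 5 for M1 and 0 -> 5 for M2.
FIG_6A = {(0, 1), (1, 2), (2, 1), (2, 5), (0, 5)}
ACC, TC = accident_and_true_collisions(FIG_6A)
```

Both collisions share the source $\{(1,0),(1,2)\}$, so they contribute a single independent equation while still counting as two true collisions.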

6 Characterization of accident-one structure graphs

In this section we characterize all structure graphs with accident 0 or 1. We have already seen that the authors of [2] missed some structure graphs for two messages. Thus, it is important to see whether there are other such graphs. To do so, we first characterize the structure graphs of a single message, for which the argument is much easier to verify. Later in this section we characterize all structure graphs for a pair of messages satisfying some event. Note that from here onwards we will not deal with the block-vertex structure graph. So, for simplicity, we will use $G$ (instead of $G^*$) to represent a structure graph and $w_r$ (instead of $w_r^*$) to represent the $M_r$-walk in the structure graph.

Let $\mathsf{struct}_a(\mathcal{M}) = \{G \in \mathsf{struct}(\mathcal{M}) : \mathbf{Acc}(G) = a\}$ be the set of all structure graphs associated to $\mathcal{M}$ with accident $a$. In particular, we are interested in $\mathsf{struct}_0(\mathcal{M})$ and $\mathsf{struct}_1(\mathcal{M})$, the sets of all structure graphs with accident 0 and 1, respectively. Lemma 8 says that the number of graphs with accident 1 is at most $\binom{m}{2}$, where $m = \sum_i m_i$ and $M_i \in \mathcal{B}^{m_i}$. The number of structure graphs with accident 0 is at most one. In the following we identify such a structure graph explicitly; hence it is unique. We call it the free graph associated to $\mathcal{M}$.

Free graphs.

As there is no accident, every non-zero vertex has in-degree 1, and 0 has in-degree 0 (i.e., it is a non-zero-output graph). Being a structure graph, $G$ is a union of $M_i$-walks $w_{M_i}$. An $M_i$-walk starting from 0 with no vertex having in-degree 2 must be a path. So $G$ is a union of $M_i$-paths $w_{M_i}$. Now for any $i \ne j$, let $p = \mathsf{LCP}(M_i; M_j)$. Then $w_i[1..p] = w_j[1..p]$ and $w_i[p+1] \ne w_j[p+1]$ (if these are defined). It is also easy to see that $w_i[1..p]$, $w_i[p+1..m_i]$ and $w_j[p+1..m_j]$ are disjoint paths. Thus, any two paths $w_i$ and $w_j$ coincide up to the length of the largest common prefix of $M_i$ and $M_j$, and afterwards they remain disjoint. We call this unique graph the free graph. A free graph for three messages is illustrated in Figure 7.
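Viewed as a data structure, the free graph is the prefix tree (trie) of the messages: each distinct non-empty prefix gets its own vertex, and two walks share exactly their common prefix. A minimal Python sketch, with hypothetical messages and function names:

```python
def free_graph(msgs):
    """Edges (parent_prefix, child_prefix, label) of the prefix tree of msgs.

    The empty prefix () plays the role of vertex 0.
    """
    edges = set()
    for msg in msgs:
        for i in range(len(msg)):
            edges.add((msg[:i], msg[:i + 1], msg[i]))
    return edges


def lcp(a, b):
    """Length of the largest common prefix of two block sequences."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


MSGS = [(1, 2, 3), (1, 2, 4, 5), (7,)]   # hypothetical messages
EDGES = free_graph(MSGS)
IN_DEGREE = {}
for _, child, _ in EDGES:
    IN_DEGREE[child] = IN_DEGREE.get(child, 0) + 1
```

Every non-root vertex has in-degree 1 and the root has in-degree 0, so the graph has no collision and is not a zero-output graph.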

Figure 7

Free structure graph for three messages.

Figure 8

Characterizing all accident-one structure graphs realizable by a single message. The dashed lines in these illustrations represent optional subwalks. Here the vertex $w[i]$ is represented by $i$, for notational simplicity.

6.1 Accident one for a single message

Now we consider the structure graph for a single message $M \in \mathcal{B}^+$. Note that any such structure graph must be a walk $w$ of length $m$. We say a node $w[i]$ is fresh in the walk if $w[i] \ne w[j]$ for all $j \ne i$.

Case A: 0 has positive in-degree.

As 0 has positive in-degree, there cannot be any further collision pairs, since otherwise the accident would be at least two. Let $c$ be the minimum positive integer such that $w[c] = 0$, so we have a cycle $(w[0], w[1], \ldots, w[c])$. Let $X$ be its label. Suppose $M = X^i Y$, where $i$ is the maximum positive integer for which we can write $M$ in this form. So $X$ is not a prefix of $Y$. Let $s = \mathsf{LCP}(X;Y)$. Thus, $w[ic+j] = w[j]$ for all $j \in [0..s]$.

  1. If $Y$ is a prefix of $X$, then the structure graph is a cycle of size $c$ ending at $w[s]$. It is illustrated in Figure 8 (a) where the * is empty.

  2. If $Y$ is not a prefix of $X$, then $w[ic+s] = w[s]$ and $w[ic+s+1] \ne w[s+1]$. Further, $w[ic+s+1] \ne w[j]$ for all $j \in [c]$, since otherwise we get a collision. In fact, all subsequent nodes are fresh: if not, let $j > ic+s+1$ be the first integer for which $w[j] = w[k]$ for some $k < j$; this yields a collision. So the structure graph is an edge-disjoint union of a cycle of size $c$ and a path starting from $w[s]$, as illustrated in Figure 8 (a). The length of the cycle is $c$, whereas the length of the path is $m - ic - s$. We also call this graph a $\rho$ graph. The tail (path from 0 to the cycle) of the $\rho$ walk is empty.

Case B: 0 has in-degree 0.

As 0 has in-degree 0, there is a collision $\delta = (u_0, v_0; z)$. In fact, all other collisions must have the same source as $\delta$.

Consider the $M$-walk $(w[0], w[1], \ldots)$, which is clearly not a path. Let $(i_0, j_0)$ be the smallest distinct integers such that $w[i_0] = w[j_0]$.[3] As 0 has in-degree 0, we have $1 \le i_0 < j_0$, and we can assume that $w[i_0 - 1] = u_0$ and $w[j_0 - 1] = v_0$. Now, as in Case A, let $A$ be the label of $w[0..i_0]$, let $X$ be the label of $w[i_0..j_0]$, and let $c = j_0 - i_0$. Then $AX$ is a prefix of $M$. Let $t$ be the largest positive integer such that $M = A X^t Y$. So $X$ is not a prefix of $Y$. If $Y$ is a prefix of $X$, then we have a structure graph as illustrated in Figure 8 (d) and (f) (the end point lies inside the cycle). Suppose $Y$ is not a prefix of $X$ and let $s = \mathsf{LCP}(X;Y)$.
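The decomposition of the message into a tail label, a repeated cycle label and a remainder can be sketched in Python as follows. The function names and the sample message are hypothetical, and the indices $i_0 < j_0$ of the first repeated vertex are taken as given (they depend on the permutation); setting $i_0 = 0$ gives the decomposition used in Case A.

```python
def lcp(a, b):
    """Length of the largest common prefix of two block sequences."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


def decompose(msg, i0, j0):
    """Split msg as A + X*t + Y with t maximal, assuming w[i0] == w[j0].

    Returns (A, X, t, Y, s) with s = LCP(X; Y).
    """
    a, x = msg[:i0], msg[i0:j0]
    c = j0 - i0
    rest = msg[j0:]
    t = 1
    while rest[:c] == x:        # strip further full copies of the cycle label
        rest = rest[c:]
        t += 1
    return a, x, t, rest, lcp(x, rest)


# Hypothetical message: tail (9,), cycle label (1, 2, 3) traversed twice, then (1, 2, 7).
EXAMPLE = decompose((9, 1, 2, 3, 1, 2, 3, 1, 2, 7), i0=1, j0=4)
```

Here the remainder $(1,2,7)$ is not a prefix of the cycle label, so the walk follows the cycle for $s = 2$ more steps before leaving it.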

Claim

The walk after $A X^t Y[1..s]$ is a path and disjoint from the rest, as illustrated in Figure 8 (c).

Proof.

Suppose there exists vw[s]w[tc+s..m]w[1..tc+s]. We distinguish the following cases.

Case B.1: w[tc+s+1]=w[i], i[tc+s]. If sj0-1, then we have a new collision δ=(w[i-1],w[tc+s];w[i]) independent of δ which increases the accident to 2. If s=j0-1, then ii0 as X[s+1]Y[1]. Now the only way to make δ dependent on δ is to have i-1=i0-1. This implies a collision at w[j] where j[1..i0-1], as the walk must come back to i0-1 at the (i-1)-th step. This again gives a new accident.

Case B.2: $w[tc+s+1] \notin w[1..tc+s]$ and $w[j] = w[i]$ for some $i \in [tc+s]$, $j \in [tc+s+2..m]$. Then there is a new collision $\delta' = (w[j-1], w[i-1]; w[i])$ which is independent of $\delta$. This gives a new accident. Thus, we have $w[tc+s+1..m] \cap w[1..tc+s] = \emptyset$.

Case B.3: $w[tc+s..m]$ is not a path. Then there exist $i,j \in [tc+s..m]$ such that $(w[i], w[j]; w[i+1])$ is a collision. Clearly this is independent of $\delta$ and hence gives a new accident. So none of the Cases B.1, B.2 or B.3 is possible. ∎

Observe that $s = j_0 - 1$ is a special case. In addition to this condition, suppose we have an edge $e := (w[i_0-1], w[tc+s+1])$ which creates a collision $\delta' = (w[i_0-1], w[j_0-1]; w[tc+s+1])$ dependent on $\delta$. The edge $e$ cannot occur in a single-message graph, as that would imply $|\mathsf{nbd}(* \to w[j])| \ge 2$ for some $j \in [0..i_0-1]$, which gives a new accident. But for a two-message graph this is realizable (these are the counter-examples), as illustrated in Figure 8 (b) and (e). We summarize our discussion in the following lemma.

Lemma 1

For $m \ge 1$, $M \in \mathcal{B}^m$ and $\pi \in \mathsf{Perm}$, the graphs in Figure 8 exhaust all possible forms for $G^{\pi}(M)$ when the accident is 1.

7 Revisiting 𝐂𝐏n(M1,M2) and 𝐅𝐂𝐏n(M1,M2) bounds

In this section our main aim is to revise the proofs of 𝐂𝐏 and 𝐅𝐂𝐏 bounds and consequently the PRF advantages in [2]. As mentioned earlier the motivation for this revision is our observation that one of the main tools [2, Lemma 10] in bounding |𝗌𝗍𝗋𝗎𝖼𝗍1[𝖼𝗈𝗅𝗅]| and |𝗌𝗍𝗋𝗎𝖼𝗍1[𝖥𝖼𝗈𝗅𝗅]| is false.

We start off with a discussion that establishes the role of structure graphs in the PRF security analysis of CBC-MAC and EMAC. We have already seen that bounding the PRF advantages of CBC-MAC and EMAC reduces to bounding the full collision probability $\mathbf{FCP}^{\mathsf{pf}}_{2,\ell}$ and the collision probability $\mathbf{CP}^{\mathsf{any}}_{2,\ell}$, respectively. So it is sufficient to bound these probabilities. For this we first prove a general claim (Proposition 1).

Structure graph events.

Let $\mathcal{M} = (M_1,\ldots,M_q)$ be a tuple of $q$ messages. Let $E$ be an event defined on the intermediate output sequence $\mathsf{out}^{\pi}(\mathcal{M})$ for a permutation $\pi$. We say that the event $E$ is defined by a structure graph if there is an event $E'$ defined on the structure graph $\mathsf{struct}^{\pi}$ such that $E$ holds if and only if $E'$ holds. We call such an event a structure graph event. Moreover, we say that $E$ is non-free if it is false for the free structure graph (the structure graph with accident 0). Note that the collision event for any distinct messages as well as the full collision event for prefix-free messages are examples of non-free structure graph events. In keeping with our notation, we denote by $\mathsf{struct}_a(E)$ the set of all structure graphs with accident $a$ satisfying a non-free event $E$.

Proposition 1

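Let $E$ be a non-free structure graph event for the message tuple $\mathcal{M}$. Then,

$$\Pr_{\Pi}[E] \le \frac{|\mathsf{struct}_1[E]|}{2^n - m} + \frac{m^4}{2^{2n}}.$$

Proof.

Note that for any structure graph event $E$,

$$\Pr_{\Pi}[E] = \sum_{a \ge 0} \Pr[\mathsf{struct}^{\Pi} \in \mathsf{struct}_a[E]].$$

As the event is non-free, the sum can be restricted to $a \ge 1$. Moreover, by Corollary 9 with $a = 2$, we know that

$$\Pr[\mathbf{Acc}(\mathsf{struct}^{\Pi}) \ge 2] \le \frac{m^4}{2^{2n}}.$$

So the result follows from Lemma 7, which bounds the probability of realizing a structure graph with accident $a$. ∎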

7.1 Revisiting the $\mathbf{CP}_{2,\ell}$ bound

Suppose $M_1 \in \mathcal{B}^{m_1}$ and $M_2 \in \mathcal{B}^{m_2}$ such that $M_1[m_1] \ne M_2[m_2]$ and $0 \le m_1 \le m_2$, since otherwise we can remove the largest common suffix, which does not change the collision probability. Note that the first message $M_1$ can now be empty (then $M_2$ is not, as they are distinct), and in this case the collision event means that $\mathsf{out}^{\Pi}(M_2)[m_2] = 0^n$. This is a structure graph event because 0 is a vertex of the structure graph. Due to Proposition 1, we only need to bound the number of structure graphs with accident 1 satisfying the $\mathsf{coll}$ event for the pair of messages. More precisely, we have to bound the size of the set $\mathsf{struct}_1(M_1,M_2)[\mathsf{coll}]$.

Case 1: M1 is an empty message.

In this case, we have

$$\mathsf{struct}_1(M_1,M_2)[\mathsf{coll}] = \mathsf{struct}_1(M_2)[w_{M_2}[m_2] = 0].$$

Now, we make the following claim, which is essentially [2, Lemma 14]:

Claim

$$|\mathsf{struct}_1(M_2)[w_2[m_2] = 0]| \le d(m_2).$$

Proof.

Let $x$ be the smallest positive integer such that $w_2[x] = 0$. Let $X$ be the label of the walk $w_2[0..x]$. If $M_2 = X^d$ for some positive integer $d$, then $\mathsf{struct}_1(M_2)[w_2[x]=0]$ contains exactly one structure graph. Note that $x$ must divide $m_2$, and hence the number of possible choices of such $x$ is at most $d(m_2)$, the number of divisors of $m_2$. Suppose instead that $M_2 = X^d Y$ for some non-empty $Y$, where $d$ is the largest such integer. If $Y$ is a prefix of $X$, then $w_2[m_2]$ is a point on the cycle and it must be 0. This can be zero only if $Y = X$, which contradicts the maximality of $d$. So now assume that $Y = Y_1 Y_2$ such that $Y_1$ is the largest common prefix of $X$ and $Y$, and $Y_2$ is some non-empty string. If $s$ is the length of $Y_1$, then $Y_2[1] = Y[s+1] \ne X[s+1]$. Thus, $w_2[dx+s+1] \ne w_2[s+1]$. As it is a zero-output structure graph, we cannot have any collision. So there is no way to obtain $w_2[m_2] = 0$. This proves the claim. ∎
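The counting step can be illustrated as follows. For a given message, each admissible cycle length $x$ must divide $m_2$, and the message must be a repetition of its first $x$ blocks. The Python sketch below uses hypothetical names and a toy message, lists these candidate lengths, and compares their number with the divisor count $d(m_2)$.

```python
def num_divisors(m):
    """The divisor function d(m)."""
    return sum(1 for x in range(1, m + 1) if m % x == 0)


def candidate_cycle_lengths(msg):
    """All x dividing len(msg) such that msg equals its first x blocks repeated."""
    m = len(msg)
    return [x for x in range(1, m + 1)
            if m % x == 0 and msg == msg[:x] * (m // x)]


M2 = (1, 2, 1, 2, 1, 2, 1, 2)   # hypothetical message with m2 = 8
LENGTHS = candidate_cycle_lengths(M2)
```

For this message the candidates are a proper subset of the divisors of 8, since the divisor 1 would require all blocks to be equal.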

Case 2: M1 is not an empty message.

In this case, we have a collision

$$(u := w_1[m_1-1],\ v := w_2[m_2-1],\ z := w_2[m_2])$$

as the labels of the last edges for the walks $w_1$ and $w_2$ are different. Any other collision must have the same source set $\{u,v\}$. Moreover, 0 cannot have positive in-degree. Now we consider different sub-cases:

Case 2.1: Both w1 and w2 are paths.

In this case, the union of w1[1..m1-1] and w2[1..m2-1] is a free graph (as w1[m1-1] and w2[m2-1] can not appear before in the graph and so no collision among the path can occur). This gives only one choice of the graph as shown in Figure 9 (a). So the number of choices is bounded by at most 1. This is proved as part of the incorrect lemma [2, Lemma 15].

Case 2.2: w2 is not a path.

Then we have already characterized all possibilities for $w_2$. So there exist integers $t, c$ such that $w_2[1..t]$ is a path with $w_2[t-1] = u$ and $w_2[t] = p$, and $w_2[t..t+c]$ is a cycle of length $c$ such that $w_2[t+c-1] = v$. (Note that $w_2[t-1] \ne w_2[m_2-1]$.) Now, $w_1[m_1-1] = u$.

Claim

w1[1..m1-1]=w2[1..t-1] and so m1=t.

Proof.

Let s be the length of the largest common prefix of w1[1..m1-1] and w2[1..t-1]. If s<t-1, then in the walk w1 there is no way to reach u without coming back to the walk w2[1..t-1]. Coming back is not possible as it leads to a collision with a different generator set. Similarly we can disprove that s=t-1 and m1>t. Thus, we have m1=t and w1[1..m1-1]=w2[1..t-1]. ∎

Now, we distinguish two cases for the choices of $p = \mathsf{LCP}(M_1;M_2)$.

Case 2.2.1 a. If $w_1[p] = z$, then we have the structure graph illustrated in Figure 9 (b). In this case, $M_1$ is a prefix of $M_2$. The number of such structure graphs is again at most $d(m_2 - m_1)$ (similar to the previous case where $M_1$ is the empty message). This is also [2, Lemma 13].

Case 2.2.1 b. If $w_1[p] \ne z$, then we get a case which was not considered in [2]. In this case, $w_1[p]$ should be a fresh node, otherwise we get a collision with a different source set. Thus, we get the structure graph shown in Figure 9 (c). Let $M_1 = A\,a$, where $A = M_1[1..t-1]$ and $a = M_1[t]$. Note that $t-1$ is the length of the largest common prefix of $M_1$ and $M_2$. Then,

$$M_2 = A\,b\,(X\,x)^{d-1}\,X\,c, \quad \text{where } c = M_2[m_2],\ b = M_2[t],\ x = a \oplus b \oplus c.$$

The choice of $X$ is variable, but it must satisfy the above for some $d > 1$. In fact, $X$ is determined by its length, which is $c$. Again, $c$ must divide $m_2 - m_1$ and hence the number of choices of $c$ is at most $d(m_2 - m_1) - 1$.

This completes the characterization of all structure graphs satisfying 𝖼𝗈𝗅𝗅 with accident 1 and bounds the number of such graphs for all cases. Note that Cases 2.2.1 a and 2.2.1 b cannot hold simultaneously. But, Cases 2.2.1 b and 2.1 can hold simultaneously which makes the total count of these two cases at most d(m2-m1). Since the order of messages does not matter in 𝖼𝗈𝗅𝗅, we are done.

Lemma 2

Let $M_1 \in \mathcal{B}^{m_1}$, $M_2 \in \mathcal{B}^{m_2}$.

  1. If $M_1 <_1 M_2$, then $\mathsf{struct}_1(M_1,M_2)[\mathsf{coll}]$ is of the form illustrated in Figure 9 (b) and the number of such graphs is at most $d(m_2)$.

  2. If $M_1 <_2 M_2$, then $\mathsf{struct}_1(M_1,M_2)[\mathsf{coll}]$ is of the form illustrated in Figure 9 (c) and the number of such graphs is at most $d(m_2)$.

  3. In all other cases, $\mathsf{struct}_1(M_1,M_2)[\mathsf{coll}]$ is of the form illustrated in Figure 9 (a) and the number of such graphs is at most one.

Figure 9

Characterizing all accident-one structure graphs realizable by two messages which satisfy the $\mathsf{coll}$ event. Dashed lines represent $w_1$ and solid lines represent $w_2$.

Corollary 3

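We have $|\mathsf{struct}_1(M_1,M_2)[\mathsf{coll}]| \le d(m_2)$ for any distinct messages $M_1$, $M_2$ with $m_1 \le m_2$. Thus,

$$\mathbf{CP}^{\mathsf{any}}_{2,\ell} \le \frac{d(\ell)}{2^n - 2\ell} + \frac{16\ell^4}{2^{2n}}.$$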

7.2 Revision of the $\mathbf{FCP}^{\mathsf{pf}}_{2,\ell}$ bound

Since $\mathsf{Fcoll}$ is a non-free structure graph event, we have, using Proposition 1,

$$\mathbf{FCP}_n^{\mathsf{pf}}(M_1,M_2) \le \frac{|\mathsf{struct}_1(\mathsf{Fcoll})|}{2^n - m_1 - m_2} + \frac{(m_1+m_2)^4}{2^{2n}}.$$

Thus, it is again sufficient to bound the number of structure graphs for two messages with accident 1 satisfying the full collision property. Bellare, Pietrzak and Rogaway [2] proved $|\mathsf{struct}_1(\mathsf{Fcoll})| \le 4\max\{m_1,m_2\}$. While bounding $|\mathsf{struct}_1(\mathsf{Fcoll})|$, they proved a strong result [2, Lemma 19] that will also be useful in our analysis. We reproduce it here in our notation.

Lemma 4

For $b \in \{1,2\}$ and any $i \in [0..m_b]$,

$$|\mathsf{struct}_1(M_1,M_2)[w_b[i] \in w_b[0..i-1,\ i+1..m_b]]| \le m_b.$$

Since the proof of Lemma 4 can be found in [2], we skip it here. Now, we revise the bound on $|\mathsf{struct}_1(\mathsf{Fcoll})|$ to $3(m_1+m_2)$, which gives the following new $\mathbf{FCP}$ bound.

Lemma 5

We have

$$\mathbf{FCP}_n^{\mathsf{pf}}(M_1,M_2) \le \frac{3(m_1+m_2)}{2^n - m_1 - m_2} + \frac{(m_1+m_2)^4}{2^{2n}}.$$

Proof.

We need to bound the number of structure graphs for a pair of prefix-free messages $M_1 \in \mathcal{B}^{m_1}$ and $M_2 \in \mathcal{B}^{m_2}$ which satisfy the $\mathsf{Fcoll}$ event and have accident at most 1. Note that the event implies that the structure graphs have accident at least 1, as the messages are prefix-free. The event $\mathsf{Fcoll}$ can be written as $w_2[m_2] \in w_2[0..m_2-1] \vee w_2[m_2] \in w_1[1..m_1]$.

Case 1: $w_2[m_2] \in w_2[0..m_2-1]$.

The number of graphs in this case is at most $m_2$, by direct application of Lemma 4.

Case 2: $w_2[m_2] \in w_1[1..m_1]$.

Suppose $\mathsf{Fcoll}(M_1;M_2)$ happens due to $w_2[m_2] = w_1[r]$ for an arbitrary $r \in [1..m_1-1]$. Then $\mathsf{Fcoll}(M_1;M_2)$ is equivalent to $\mathsf{coll}(M_1[1..r], M_2)$. For simplicity let $M_1' := M_1[1..r]$. Let $s := \mathsf{LCS}(M_1';M_2)$. Then $M_1'[s-1] \ne M_2[m_2-r+s-1]$. Let

$$M_1^* = M_1'[1..s-1], \qquad M_2^* = M_2[1..m_2-r+s-1].$$

From Lemma 2 we know that $G^* \in \mathsf{struct}_1(M_1^*;M_2^*)[\mathsf{coll}]$ must be one of (a), (b) or (c) in Figure 9. Note that $G^*$ is a subgraph of some $G \in \mathsf{struct}_1(M_1;M_2)[\mathsf{Fcoll}]$.

Case 2.1: $G^*$ is as in Figure 9 (a). In this case, $w_1^*$ and $w_2^*$ are paths. For a fixed $r$ the only possible collision is at $(w_1^*[s-2], w_2^*[m_2-r+s-2]; w_1^*[s-1])$, and hence the number of such graphs is at most 1. There are at most $m_1$ possible values for $r$. So the number of choices for $G \in \mathsf{struct}_1(M_1;M_2)[\mathsf{Fcoll}]$ is at most $m_1$.

Case 2.2: $G^*$ is either as in Figure 9 (b) or (c). In this case, at least one of $w_1^*$ and $w_2^*$ is not a path. Without loss of generality assume $w_1^*$ is not a path. Let $p^* = \mathsf{LCP}(M_1^*;M_2^*)$. We know that $M_1^* <_1 M_1$ and $M_2^* <_1 M_2$. Thus $M_1[1..p^*] = M_2[1..p^*]$. Now we must have a collision $(u,v;z)$ in $w_1^*$. From Lemma 2 we know that the graph can be either Figure 9 (b) or (c), depending on whether $z = w_1^*[p^*]$ or $z = w_1^*[p^*+1]$. Next we make two claims which will enable us to bound the two cases. The proofs of these two claims are given later in the section.

Claim 1

If $G^*$ is Figure 9 (b), then $w_1[\mathsf{LCP}(M_1;M_2)]$ is not fresh in $w_1$.

Claim 2

If $G^*$ is Figure 9 (c), then $w_1[\mathsf{LCP}(M_1;M_2)+1]$ is not fresh in $w_1$.

Recall that in a walk $w$ a vertex $w[i]$ is not fresh if there exists $j \ne i$ such that $w[j] = w[i]$. By Claim 1 we know that $w_1[\mathsf{LCP}(M_1;M_2)]$ is not fresh when $G^*$ is as in Figure 9 (b). Similarly, by Claim 2 we know that $w_1[\mathsf{LCP}(M_1;M_2)+1]$ is not fresh when $G^*$ is as in Figure 9 (c). So using Lemma 4, we bound the number of such graphs $G$ by at most $m_1 + m_1 = 2m_1$ when $w_1^*$ is not a path. Similarly we have at most $2m_2$ choices when $w_2^*$ is not a path. Therefore the total number of choices in Case 2.2 is at most $2(m_1+m_2)$. Combining Cases 1, 2.1 and 2.2, we have at most $3(m_1+m_2)$ choices. The result follows. ∎

Proof of Claim 1.

If $G^*$ is like Figure 9 (b), we must have $z = w_1^*[p^*]$. Let $q$ be the minimum index such that $w_1^*[q] = w_1^*[p^*]$. Let $P$ be the label of $w_1^*[0..p^*]$, let $X$ be the label of $w_1^*[p^*..q]$, and let $c = q - p^*$. Then $M_1^* = PX$ and $M_2^* = P$. As $M_1^*$ and $M_2^*$ are formed by removing the largest common suffix from $M_1'$ and $M_2$, respectively,

$$M_1' = (M_1^* X^{i_1} Y) = (P X^{i_1+1} Y) \quad\text{and}\quad M_2 = (M_2^* X^{i_2} Y) = (P X^{i_2} Y),$$

where $i_1, i_2 \ge 0$ are the largest such indices. Since $M_1$ and $M_2$ are prefix-free, we have $i_1 + 1 > i_2$. Now $M_1 = (M_1' Z) = (P X^{i_1+1} Y Z)$, where $|Z| \ge 0$. From now onwards we will work on the walk $w_1$ (instead of $w_1^*$, which is a subwalk of $w_1$) corresponding to $M_1$. If $Y$ is a prefix of $X$, then $M_2 <_1 M_1$, which contradicts the prefix-free condition. So $Y$ is not a prefix of $X$. If $X$ is a prefix of $Y$, then it contradicts the maximality of $i_1, i_2$. So $X$ is not a prefix of $Y$. Assume $Y = Y_1 Y_2$ such that $Y_1$ is the largest common prefix of $X$ and $Y$, and $Y_2$ is some non-empty string. If $p$ is the length of $Y_1$, then $Y_2[1] = Y[p+1] \ne X[p+1]$. Thus,

$$M_1[1..i_2c+p] = M_2[1..i_2c+p] \quad\text{and}\quad M_1[i_2c+p+1] \ne M_2[i_2c+p+1].$$

So, $p = \mathsf{LCP}(M_1;M_2)$. Further, since $i_2 < i_1 + 1$, $w_1[p]$ is traversed twice. Thus, $w_1[\mathsf{LCP}(M_1;M_2)]$ will not be fresh. Note that we started off with an arbitrary $r$. So $w_1[\mathsf{LCP}(M_1;M_2)]$ will not be fresh irrespective of the value of $r$. ∎

Proof of Claim 2.

If G* is like Figure 9 (c), we must have z=w1*[p*+1]. As noted earlier in the revision of the 𝐂𝐏 bound, this case was missing in the proof in [2]. Using a similar line of argument as in the previous case, we can conclude that irrespective of the value of r, the cycle goes through w1[𝖫𝖢𝖯(M1;M2)+1] twice. Thus, w1[𝖫𝖢𝖯(M1;M2)+1] is not fresh. ∎

Note that our approach in Case 2.2 above is a bit subtle. We used Lemma 2 to identify a fundamental property (the cycle goes through $p$ or $p+1$ twice) and then exploited this property to bound the count. A straightforward approach of summing the counts for the graphs in Figure 9 (b) and (c) over all values of $r$ gives a worse bound of $m_b\, d(m_b)$, $b \in \{1,2\}$. This subtlety is what yields the tighter bound of $m_b$. We now extend the bound for $\mathbf{FCP}_n^{\mathsf{pf}}(M_1;M_2)$ to $\mathbf{FCP}_{q,\ell}^{\mathsf{pf}}$ in order to get the revised PRF bound for CBC-MAC:

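\begin{align*}
\mathbf{FCP}_{q,\ell}^{\mathsf{pf}} &\le \sum_{i \ne j \in [q]} \mathbf{FCP}_n^{\mathsf{pf}}(M_i;M_j) \\
&\le \sum_{i \ne j \in [q]} \left( \frac{3(m_i+m_j)}{2^n - m_i - m_j} + \frac{(m_i+m_j)^4}{2^{2n}} \right) \\
&\le \sum_{i \ne j \in [q]} \left( \frac{6(m_i+m_j)}{2^n} + \frac{(m_i+m_j)^4}{2^{2n}} \right) \\
&\le \frac{12mq}{2^n} + \frac{16mq\ell^3}{2^{2n}} \le \frac{12\sigma q}{2^n} + \frac{16\sigma q\ell^3}{2^{2n}}. \tag{3}
\end{align*}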

Here we have computed the bound in terms of $q$, $\ell$ and $\sigma$. Another approach (as used in [2]) is to bound the value using $q$ and $\ell$ only, in which case the bound is

\[ \mathbf{FCP}_{q,\ell}^{\mathsf{pf}} \le \frac{12\ell q^2}{2^n} + \frac{16\ell^4 q^2}{2^{2n}}. \]

Using Proposition 2 and (3), we get the following theorem.

Theorem 6

We have

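\[ \mathbf{Adv}^{\mathsf{pf}}_{\mathrm{CBC}}(q,\ell,\sigma) \le \frac{14\sigma q}{2^n} + \frac{16\sigma q\ell^3}{2^{2n}} + \frac{q^2}{2^{n+1}}. \]

This gives a bound of $O(\sigma q/2^n)$ for $\ell < 2^{n/3}$. As noted earlier, this is a better bound whenever the average message length is much smaller than the length of the longest message.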
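As a numerical sanity check of this remark, the sketch below evaluates the right-hand side of (3) and the bound stated in terms of $q$ and $\ell$ only, for a hypothetical list of message lengths. The function names, the example workload and the choice $n = 64$ are assumptions made for illustration; exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def bound_sigma(lengths, n):
    """Right-hand side of (3): 12*sigma*q/2^n + 16*sigma*q*ell^3/2^(2n)."""
    q, sigma, ell = len(lengths), sum(lengths), max(lengths)
    return Fraction(12 * sigma * q, 2**n) + Fraction(16 * sigma * q * ell**3, 2 ** (2 * n))


def bound_ell(lengths, n):
    """Bound in q and ell only: 12*ell*q^2/2^n + 16*ell^4*q^2/2^(2n)."""
    q, ell = len(lengths), max(lengths)
    return Fraction(12 * ell * q**2, 2**n) + Fraction(16 * ell**4 * q**2, 2 ** (2 * n))


# Hypothetical workload: nine short messages and one long message.
lengths = [2] * 9 + [100]
n = 64
ratio = bound_sigma(lengths, n) / bound_ell(lengths, n)
print(ratio)  # 59/500, i.e. sigma / (ell * q) = 118 / 1000
```

For this workload the bound in terms of $\sigma$ is smaller by exactly the factor $\sigma/(\ell q)$, which is the ratio of the average to the maximum message length.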

8 Revised security analysis of EMAC

In this section we revisit the PRF analysis of EMAC due to Pietrzak [31]. We first identify the actual flaw in the proof and then provide a different proof that in fact yields a better bound for EMAC (in terms of $\ell$). For notational simplicity we keep our bounds in order notation and omit the constant factors.

8.1 Flaw and revision of PRF advantage of EMAC

The proposed bound for EMAC as stated in [31] is

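\[ \mathbf{Adv}^{\mathsf{prf}}_{\mathrm{EMAC}}(q,\ell,\sigma) = O\left(\frac{q^2}{2^n}\left(1 + \frac{\ell^8}{2^n}\right)\right) \]

provided $\ell^2 \le q$. Thus, it becomes the tight bound $q^2/2^n$ when $\ell \le \min(q^{1/2}, 2^{n/8})$. To show the above result we need to bound the collision probability $\mathbf{CP}_{q,\ell}$. One possible approach is to partition the $q$ messages into $O(q/\ell^2)$ groups, each consisting of about $\ell^2$ messages. A collision among the $q$ messages then implies that a collision occurs among the messages of two of the groups. Since $\mathsf{coll}$ is a non-free event, Proposition 1 gives

\[ \mathbf{CP}_{q,\ell} = O\left(\frac{|\mathsf{struct}_1(\mathcal{M})[\mathsf{coll}]|}{2^n}\right) + O\left(\frac{q^4\ell^4}{2^{2n}}\right). \]

Applying this with $q = 2\ell^2$ (i.e. for two groups), we have

\[ \mathbf{CP}_{q,\ell} = O\left(\frac{q^2}{\ell^4}\right) \times \mathbf{CP}_{2\ell^2,\ell} = O\left(\frac{q^2 N}{\ell^4 2^n}\right) + O\left(\frac{q^2\ell^8}{2^{2n}}\right), \]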

where $N$ denotes the number of accident-one structure graphs satisfying $\mathsf{coll}$ for $2\ell^2$ messages with maximum length $\ell$. The $O(q^2/\ell^4)$ term is due to the number of ways in which we can choose two groups. In [31, Lemma 4], Pietrzak claimed that $N = O(\ell^4)$. Plugging in this bound for $N$ gives the desired bound. To prove this bound for $N$, Pietrzak considered two cases for a pair of messages $M$ and $M'$ (note that accident 1 and the collision must occur for a pair of messages). More precisely, it can be shown that

\[ N = \ell^4 \max_{M \not<_1 M'} |\mathsf{struct}_1(M, M')[\mathsf{coll}]| + \ell^4. \tag{4} \]

Recall that $M \not<_1 M'$ means that $M$ and $M'$ become prefix-free after removing their largest common suffix.

Claim ([31, Claim 1])

If $M \not<_1 M'$, then $|\mathsf{struct}_1(M, M')[\mathsf{coll}]| = 1$.

If this claim were true, then $N = O(\ell^4)$. However, we have seen before that there exist $M$, $M'$ with $M <_2 M'$ (so that $M \not<_1 M'$) and $|\mathsf{struct}_1(M, M')[\mathsf{coll}]| = d(\ell-1)$. Thus,

\[ |\mathsf{struct}_1(M, M')[\mathsf{coll} \wedge M \not<_1 M']| = O(d(\ell)). \]

Plugging this in, we find the modified bound $N = O(\ell^4 (d(\ell))^2)$, and so the revised bound for the collision probability becomes $O(q^2 d(\ell)/2^n)$, which is not tight.
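To get a feeling for the size of the extra factor, the sketch below computes $d(\ell)$ for a few values of $\ell$. It assumes that $d(\cdot)$ denotes the number-of-divisors function (the definition is not restated in this section, so this reading is an assumption), and `num_divisors` is a hypothetical helper name.

```python
def num_divisors(m):
    """Count the positive divisors of m by trial division up to sqrt(m)."""
    count = 0
    d = 1
    while d * d <= m:
        if m % d == 0:
            # d and m // d are both divisors; count a square root only once.
            count += 1 if d * d == m else 2
        d += 1
    return count


for ell in (12, 1024, 720720):
    print(ell, num_divisors(ell))
# 12 6
# 1024 11
# 720720 240
```

Under this reading, the factor is small for prime powers but can be much larger for highly composite lengths, which is why a bound carrying $d(\ell)$ falls short of the tight $q^2/2^n$.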

8.2 Simple proof of EMAC

We have seen in the last subsection that the flaw in [2, Lemma 10] has more serious consequences for the tight bound of EMAC, so it is crucial to revisit the security analysis of EMAC. One possible approach to fix the proof of [31] is to bound $N$ in a different way. For example, we can consider the two cases $M <_1 M'$ and $M <_2 M'$ (i.e., $M[1..m-1] <_1 M'$ but $M \not<_1 M'$). For any pair of messages not related by either of these two relations, the number of structure graphs can be shown to be one. However, we would still need to show that the number of remaining graphs is about $\ell^4$ (see the second term of (4)).

In this subsection we actually take a slightly different and, in fact simpler, approach. Instead of making groups of q messages, we directly bound the number of structure graphs for a slightly different choice of permutations. We will ignore all those permutations (i.e. bad permutations) which induce one of the following:

  1. For some pair of messages Mi and Mj the accident is two or more.

  2. For some message Mi, the accident is one.

Let $\phi$ be the property representing the complement of this event. Let $S$ be a structure graph associated to a $q$-tuple of messages $\mathcal{M}$. We recall that $S$ is a union of $q$ walks $w_i$. We use the sub-graphs $S_i$ and $S_{i,j}$ to represent the walks $w_i$ and $w_i \cup w_j$, respectively. Note that these are again structure graphs, associated to $M_i$ and $(M_i, M_j)$, respectively. In this notation, $\phi$ is the property of structure graphs $S$ on $\mathcal{M}$ that $\mathbf{Acc}(S_{i,j}) \le 1$ for all $i \ne j$ and $\mathbf{Acc}(S_i) = 0$ for all $i$. We call a permutation good if its induced structure graph satisfies $\phi$; otherwise we call it bad. Now we claim our new bound.

Lemma 1

We have

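\[ \mathbf{CP}_{q,\ell}(\mathcal{M}) \le O\left(\frac{q^2}{2^n}\right) + O\left(\frac{\ell^2 q}{2^n}\right) + O\left(\frac{\ell^4 q^2}{2^{2n}}\right). \]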

Proof.

We first bound the probability that a random permutation is bad. For a bad permutation, (i) there exist $i$ and $j$ such that the accident for the pair of messages $M_i$ and $M_j$ is at least 2, or (ii) there exists $i$ such that the accident for $M_i$ is at least one. The first event happens with probability $O(\ell^4 q^2/2^{2n})$ by Corollary 9. Similarly, the second event happens with probability $O(\ell^2 q/2^n)$. Now we bound the probability $p := \Pr[\mathsf{coll} \wedge \phi]$. Note that $\mathsf{coll}$ implies that there exist $i$ and $j$ such that the collision event holds for the messages $M_i$ and $M_j$. Now $\phi$ implies that the accident of $S_{i,j}$ is one, whereas the accidents of $S_i$ and $S_j$ are zero. In Section 6 we characterized all structure graphs with accident 1 for a pair of messages that satisfy the collision event. Among all these possibilities, exactly one structure graph satisfies $\phi$. This implies that $\Pr[\mathsf{coll}(M_i, M_j) \wedge \phi] = O(2^{-n})$. Hence, summing over all possible $i, j$, we have

\[ \Pr[\mathsf{coll}(\mathcal{M}) \wedge \phi] = O\left(\frac{q^2}{2^n}\right). \]

The above discussion can be summarized as follows:

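\begin{align*}
\mathbf{CP}_{q,\ell}(\mathcal{M}) &\le \Pr_{\Pi}\big[\mathsf{coll}_{\Pi}(\mathcal{M}) \wedge \big(\mathsf{struct}_{\Pi}(\mathcal{M}) \in \mathsf{struct}(\mathcal{M})[\phi]\big)\big] + \Pr\big[\mathsf{struct}_{\Pi}(\mathcal{M}) \notin \mathsf{struct}(\mathcal{M})[\phi]\big] \\
&= \sum_{i \ne j} O\left(\frac{|\mathsf{struct}(M_i, M_j)[\mathsf{coll} \wedge \phi]|}{2^n}\right) + O\left(\frac{\ell^2 q}{2^n}\right) + O\left(\frac{\ell^4 q^2}{2^{2n}}\right) \\
&= O\left(\frac{q^2}{2^n}\right) + O\left(\frac{\ell^2 q}{2^n}\right) + O\left(\frac{\ell^4 q^2}{2^{2n}}\right).
\end{align*}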

This completes the proof. ∎

Theorem 2

We have

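\[ \mathbf{Adv}^{\mathsf{any}}_{\mathsf{EMAC}}(q,\ell,\sigma) = O\left(\frac{\ell^2 q}{2^n} + \frac{q^2}{2^n} + \frac{q^2\ell^4}{2^{2n}}\right). \]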

So if $\ell \le \min\{q^{1/2}, 2^{n/4}\}$, then

\[ \mathbf{Adv}^{\mathsf{any}}_{\mathsf{EMAC}}(q,\ell,\sigma) = O\left(\frac{q^2}{2^n}\right). \]

Note that our theorem gives a tight bound under a weaker constraint than the one in [31]. The condition $q > \ell^2$ can be dropped if we assume $\ell \le 2^{n/4-k}$ for some small $k$ such that $2^{-k}$ is negligible. More precisely, if $\ell \le 2^{n/4-k}$, then the PRF advantage of EMAC is about $\frac{q^2}{2^n} + \frac{1}{2^k}$.
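The role of the two conditions can be seen by comparing the three terms of the collision bound of the lemma above numerically. The sketch below sets all constants hidden in the $O(\cdot)$ notation to 1, which is an assumption made purely for illustration, and the names `emac_terms` and `birthday_dominates` are hypothetical.

```python
from fractions import Fraction


def emac_terms(q, ell, n):
    """Return (q^2/2^n, ell^2*q/2^n, ell^4*q^2/2^(2n)) with O-constants set to 1."""
    birthday = Fraction(q * q, 2**n)
    single = Fraction(ell**2 * q, 2**n)
    pair = Fraction(ell**4 * q * q, 2 ** (2 * n))
    return birthday, single, pair


def birthday_dominates(q, ell, n):
    """True when neither of the other two terms exceeds the q^2/2^n term."""
    birthday, single, pair = emac_terms(q, ell, n)
    return single <= birthday and pair <= birthday


n = 64
print(birthday_dominates(2**32, 2**16, n))  # True:  ell = 2^(n/4) and ell^2 <= q
print(birthday_dominates(2**20, 2**16, n))  # False: ell^2 > q
print(birthday_dominates(2**40, 2**17, n))  # False: ell > 2^(n/4)
```

In the first call $\ell = 2^{16}$ is far above $2^{n/8} = 2^{8}$, so this parameter choice lies outside the original constraint but inside the relaxed one.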

9 Conclusion

In this paper we have revisited the PRF security analysis of CBC-MAC and EMAC. The revision was necessary because one of the main claims in the original papers providing improved bounds is not correct. This claim, in fact, influences some of the other claims. More importantly, the tight bound claimed for EMAC remains invalid even after a simple fix of the claim. So we feel that a revision is essential, and this paper provides it. Fortunately, we have recovered the same bounds, at least in terms of the order, for both constructions. For CBC-MAC, we have obtained the potentially better bound of $O(\sigma q/2^n)$. Moreover, we have found a better way to analyze EMAC which provides a tight bound under a much more relaxed constraint on the message length $\ell$: our constraint is $\ell < 2^{n/4}$, whereas the original constraint was $\ell < 2^{n/8}$.


Communicated by Simon Blackburn


Acknowledgements

We have communicated with the authors of the papers [2, 31] and they have acknowledged our findings. We would like to thank them for giving their valuable time to go through our findings.

References

[1] Bellare M., Kilian J. and Rogaway P., The security of the cipher block chaining message authentication code, J. Comput. Syst. Sci. 61 (2000), 362–399. DOI: 10.1006/jcss.1999.1694

[2] Bellare M., Pietrzak K. and Rogaway P., Improved security analyses for CBC MACs, Advances in Cryptology – CRYPTO 2005, Lecture Notes in Comput. Sci. 3621, Springer, Berlin (2005), 527–545. DOI: 10.1007/11535218_32

[3] Bernstein D. J., A short proof of the unpredictability of cipher block chaining, preprint 2005, https://cr.yp.to/antiforgery/easycbc-20050109.pdf.

[4] Bosselaers A. and Preneel B., Integrity Primitives for Secure Information Systems. Final Report of Race Integrity Primitives, Lecture Notes in Comput. Sci. 1007, Springer, Berlin, 1995. DOI: 10.1007/3-540-60640-8

[5] Black J. and Rogaway P., A block-cipher mode of operation for parallelizable message authentication, Advances in Cryptology – EUROCRYPT 2002, Lecture Notes in Comput. Sci. 2332, Springer, Berlin (2002), 384–397. DOI: 10.1007/3-540-46035-7_25

[6] Black J. and Rogaway P., CBC MACs for arbitrary-length messages: The three-key constructions, J. Cryptology 18 (2005), 111–131. DOI: 10.1007/3-540-44598-6_12

[7] Datta N., Dutta A., Nandi M., Paul G. and Zhang L., One-key double-sum MAC with beyond-birthday security, preprint 2015, http://eprint.iacr.org/2015/958.

[8] Diffie W. and Hellman M. E., New directions in cryptography, IEEE Trans. Inform. Theory 22 (1976), 644–654. DOI: 10.1109/TIT.1976.1055638

[9] Dodis Y., Gennaro R., Håstad J., Krawczyk H. and Rabin T., Randomness extraction and key derivation using the CBC, cascade and HMAC modes, Advances in Cryptology – CRYPTO 2004, Lecture Notes in Comput. Sci. 3152, Springer, Berlin (2004), 494–510. DOI: 10.1007/978-3-540-28628-8_30

[10] Dutta A., Nandi M. and Paul G., One-key compression function based MAC with BBB security, preprint 2015, http://eprint.iacr.org/2015/1016.

[11] Dworkin M., Recommendation for block cipher modes of operation: The CMAC mode for authentication, preprint 2005, http://dx.doi.org/10.6028/NIST.SP.800-38B-2005. DOI: 10.6028/NIST.SP.800-38b-2005

[12] Ehrsam W. F., Meyer C. H. W., Smith J. L. and Tuchman W. L., Message verification and transmission error detection by block chaining, US Patent 4074066, 1976.

[13] Gaži P., Pietrzak K. and Rybár M., The exact PRF-security of NMAC and HMAC, Advances in Cryptology. Part I – CRYPTO 2014, Lecture Notes in Comput. Sci. 8616, Springer, Berlin (2014), 113–130. DOI: 10.1007/978-3-662-44371-2_7

[14] Gaži P., Pietrzak K. and Tessaro S., Tight bounds for keyed sponges and truncated CBC, preprint 2015, http://eprint.iacr.org/2015/053.

[15] Gorbunov S. and Rackoff C., On the security of cipher block chaining message authentication code, preprint 2016, https://cs.uwaterloo.ca/~sgorbuno/publications/securityOfCBC.pdf.

[16] International Organization for Standardization, Information technology – Security techniques – Message authentication codes (MACs) – Part 1: Mechanisms using a block cipher, ISO/IEC 9797-1, Geneva, 1999.

[17] Iwata T. and Kurosawa K., OMAC: One-key CBC MAC, Fast Software Encryption (Lund 2003), Lecture Notes in Comput. Sci. 2887, Springer, Berlin (2003), 129–153. DOI: 10.1007/978-3-540-39887-5_11

[18] Iwata T. and Kurosawa K., Stronger security bounds for OMAC, TMAC, and XCBC, Progress in Cryptology – INDOCRYPT 2003, Lecture Notes in Comput. Sci. 2904, Springer, Berlin (2003), 402–415. DOI: 10.1007/978-3-540-24582-7_30

[19] Jaulmes É., Joux A. and Valette F., On the security of randomized CBC-MAC beyond the birthday paradox limit: A new construction, Fast Software Encryption (Leuven 2002), Lecture Notes in Comput. Sci. 2365, Springer, Berlin (2002), 237–251. DOI: 10.1007/3-540-45661-9_19

[20] Jutla C. S., PRF domain extension using DAGs, Theory of Cryptography, Third Theory of Cryptography Conference (New York 2006), Lecture Notes in Comput. Sci. 3876, Springer, Berlin (2006), 561–580. DOI: 10.1007/11681878_29

[21] Kurosawa K. and Iwata T., TMAC: Two-key CBC MAC, IEICE Trans. 87-A (2004), 46–52. DOI: 10.1007/3-540-36563-X_3

[22] Maurer U. M., Indistinguishability of random systems, Advances in Cryptology – EUROCRYPT 2002, Lecture Notes in Comput. Sci. 2332, Springer, Berlin (2002), 110–132. DOI: 10.1007/3-540-46035-7_8

[23] Minematsu K. and Matsushima T., New bounds for PMAC, TMAC, and XCBC, Fast Software Encryption (Luxembourg 2007), Lecture Notes in Comput. Sci. 4593, Springer, Berlin (2007), 434–451. DOI: 10.1007/978-3-540-74619-5_27

[24] Nandi M., Fast and secure CBC-type MAC algorithms, Fast Software Encryption (Leuven 2009), Lecture Notes in Comput. Sci. 5665, Springer, Berlin (2009), 375–393. DOI: 10.1007/978-3-642-03317-9_23

[25] Nandi M., Improved security analysis for OMAC as a pseudorandom function, J. Math. Cryptol. 3 (2009), 133–148. DOI: 10.1515/JMC.2009.006

[26] Nandi M., A unified method for improving PRF bounds for a class of blockcipher based MACs, Fast Software Encryption (Seoul 2010), Lecture Notes in Comput. Sci. 6147, Springer, Berlin (2010), 212–229. DOI: 10.1007/978-3-642-13858-4_12

[27] Nandi M. and Mandal A., Improved security analysis of PMAC, J. Math. Cryptol. 2 (2008), 149–162. DOI: 10.1515/JMC.2008.007

[28] Patarin J., Étude des générateurs de permutations pseudo-aléatoires basés sur le schéma du DES, Ph.D. thesis, Université de Paris, 1991.

[29] Patarin J., The “coefficients H” technique, Selected Areas in Cryptography (Sackville 2008), Lecture Notes in Comput. Sci. 5381, Springer, Berlin (2008), 328–345. DOI: 10.1007/978-3-642-04159-4_21

[30] Petrank E. and Rackoff C., CBC MAC for real-time data sources, J. Cryptology 13 (2000), 315–338. DOI: 10.1007/s001450010009

[31] Pietrzak K., A tight bound for EMAC, Automata, Languages and Programming. Part II (Venice 2006), Lecture Notes in Comput. Sci. 4052, Springer, Berlin (2006), 168–179. DOI: 10.1007/11787006_15

[32] Vaudenay S., Decorrelation: A theory for block cipher security, J. Cryptology 16 (2003), 249–286. DOI: 10.1007/s00145-003-0220-6

[33] Wegman M. N. and Carter L., New classes and applications of hash functions, 20th Annual Symposium on Foundations of Computer Science (San Juan 1979), IEEE Press, Piscataway (1979), 175–182. DOI: 10.1109/SFCS.1979.26

[34] Wigert S., Sur l’ordre de grandeur du nombre des diviseurs d’un entier, Ark. Mat. Astron. Fys. 3 (1907), no. 18, 1–9.

[35] Yasuda K., The sum of CBC MACs is a secure PRF, Topics in Cryptology – CT-RSA 2010, Lecture Notes in Comput. Sci. 5985, Springer, Berlin (2010), 366–381. DOI: 10.1007/978-3-642-11925-5_25

[36] Yasuda K., A new variant of PMAC: Beyond the birthday bound, Advances in Cryptology – CRYPTO 2011, Lecture Notes in Comput. Sci. 6841, Springer, Berlin (2011), 596–609. DOI: 10.1007/978-3-642-22792-9_34

[37] Zhang L., Wu W., Sui H. and Wang P., 3kf9: Enhancing 3GPP-MAC beyond the birthday bound, Advances in Cryptology – ASIACRYPT 2012, Lecture Notes in Comput. Sci. 7658, Springer, Berlin (2012), 296–312. DOI: 10.1007/978-3-642-34961-4_19

Received: 2016-5-24
Accepted: 2016-10-12
Published Online: 2016-11-8
Published in Print: 2016-12-1

© 2016 by De Gruyter

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Downloaded on 29.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/jmc-2016-0030/html