
The central tree property and algorithmic problems on subgroups of free groups

  • Mallika Roy, Enric Ventura and Pascal Weil
Published/Copyright: February 14, 2024

Abstract

We study the average case complexity of the Uniform Membership Problem for subgroups of free groups, and we show that it is orders of magnitude smaller than the worst case complexity of the best known algorithms. This applies to subgroups given by a fixed number of generators as well as to subgroups given by an exponential number of generators. The main idea behind this result is to exploit a generic property of tuples of words, called the central tree property. An application is given to the average case complexity of the Relative Primitivity Problem, using Shpilrain’s recent algorithm to decide primitivity, whose average case complexity is a constant depending only on the rank of the ambient free group.

1 Introduction

Algorithmic problems have been prominent in the theory of infinite groups at least since Dehn formulated the word problem for finitely presented groups [7]: a finite group presentation ⟨A ∣ R⟩ being fixed, the word problem asks whether a given element of the free group on A (seen as a reduced word on the alphabet A ∪ A^{−1}) is equal to the identity in the group presented by ⟨A ∣ R⟩.

A problem is called decidable if one can exhibit an algorithm that solves it. It is well known that the word problem is undecidable for certain finite group presentations (Novikov [16]). For decidable problems, it is natural to try to evaluate the complexity of an algorithm solving them, namely the amount of resources (time or space) required to run the algorithm, as a function of the size of the input.

The most common complexity evaluation for an algorithm 𝒜 is the worst case complexity, which measures the maximum time required to run 𝒜 on an input of size n. In certain cases, it may be relevant to consider the generic complexity of 𝒜: 𝒜 has generic complexity at most f(n) if the proportion of inputs of size n on which 𝒜 requires time at most f(n) tends to 1 as n tends to infinity. This notion of complexity recognizes that the instances that are hard for 𝒜 (the witnesses of the worst case complexity) may be few, but it does not attempt to quantify the time required on those remaining instances, which form a set of vanishing probability.

Here we will be concerned with the more precise average case complexity, namely the expected time required to run 𝒜 on size 𝑛 instances taken uniformly at random. This measure of complexity takes into account the resources needed for every input.

The specific problems we consider in this paper are the Uniform Membership Problem and the Relative Primitivity Problem in finite rank free groups. Recall that an element w of F(A), the free group on A, is primitive if F(A) admits a free basis containing w. The Uniform Membership Problem (resp., the Relative Primitivity Problem) is the following: given elements w_0, w_1, …, w_k of F(A), decide whether w_0 belongs to (resp., is primitive in) the subgroup H of F(A) generated by w_1, …, w_k. In this paper, the length k of the tuple (w_1, …, w_k) is not fixed, and the parameters we consider to gauge the size of an instance are n = max{|w_i| ∣ 1 ≤ i ≤ k}, k as a function of n, and m = |w_0|. In particular, the total length of an input (w_0, w_1, …, w_k) is at most kn + m.

It is well known that both these problems are decidable and can be solved with polynomial worst case complexity (see [20], [17, Fact 3.6], and Sections 4, 5 below). The most efficient solution (again, from the point of view of the worst case complexity) of the Uniform Membership Problem uses the concept of the Stallings graph of a subgroup, a finite A-labeled graph uniquely associated with a finitely generated subgroup of F(A), which can be easily computed and then gives a linear time solution for the Uniform Membership Problem. More precisely, the Stallings graph of H has at most kn vertices and it is computed in time O(kn log*(kn)) (see [21]). After this computation, deciding whether w_0 ∈ H is done in time linear in m.

Our main result is an algorithm solving the Uniform Membership Problem whose average case complexity is asymptotically a little-o of the worst case complexity described above, at least when the number k grows at most polynomially with n. It is notable that the dependence of our algorithm's expected performance on m (the length of the word w_0 to be tested) is extremely low.

A specific instance of our main result shows that, if k is a constant, then the Uniform Membership Problem can be solved in expected time

O(log n + m n^{−log(2r−1)}),

where r = |A| is taken to be constant.

The fundamental ingredient in this result is the so-called central tree property (ctp) for a tuple ( w 1 , , w k ) . This property, formally introduced in [2, 3], holds when the w i and their inverses have little initial cancellation (that is, they have short common prefixes). This property turns out to hold with high probability and, when it holds, solving the Uniform Membership Problem is considerably simpler than in the general case.

We then apply our result to the Relative Primitivity Problem: we give an algorithm solving it whose average case complexity is much lower than its worst case complexity. Here we use an algorithm recently proposed by Shpilrain [19] to solve the Primitivity Problem (deciding whether a given word w 0 is primitive in F ( A ) ). Shpilrain’s algorithm has the remarkable property of having constant average case complexity, that is, its expected time does not depend on the length of w 0 . As it turns out, this constant average case complexity depends on the rank of the ambient free group and this is important in the context of the Relative Primitivity Problem, where we need to test for primitivity in the subgroup generated by w 1 , , w k , whose rank may be as large as 𝑘.

The paper is organized as follows. Section 2 briefly discusses the fundamental notions on subgroups of free groups which we will use, especially the notions of the Stallings graph and the growth function of a subgroup, as well as the notion of average case complexity and the computational model which we rely upon. Section 3 is dedicated to the central tree property, applied to a tuple w = ( w 1 , , w k ) , its definition, its consequences in terms of the rank and the growth function of the subgroup generated by w , and the probability that it holds (in terms of the parameters 𝑘 and 𝑛). Our main result on the average case complexity of the Uniform Membership Problem is presented in Section 4. Finally, we discuss Shpilrain’s constant average case complexity algorithm for the Primitivity Problem, and our application to the Relative Primitivity Problem in Section 5.

2 Preliminaries

2.1 Subgroups of free groups

Throughout the paper, A is a finite non-empty set, called an alphabet. We say that a directed graph Γ is an A-graph if its edges are labeled with elements of A. A rooted A-graph is a pair (Γ, v), where Γ is a finite A-graph and v is a vertex of Γ. Finally, we say that a rooted A-graph (Γ, v) is reduced if Γ is finite and connected, distinct edges with the same start (resp., end) vertex always have distinct labels, and every vertex, except possibly the root v, is incident to at least two edges.

The set Ã = {a, a^{−1} ∣ a ∈ A} (with cardinality 2|A|) is called the (symmetrized) alphabet; its elements are called letters. We denote by Ã* the set of words on Ã, that is, of finite sequences of letters. We also denote by A* the set of words using only letters in A. A word in Ã* is said to be reduced if no letter a ∈ A is immediately preceded or followed by the letter a^{−1}.

For convenience, if A = {a_1, …, a_r}, we let a_{−i} = a_i^{−1} for 1 ≤ i ≤ r, so that Ã = {a_i ∣ −r ≤ i ≤ r, i ≠ 0}.
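With this convention, a word over Ã can be represented as a sequence of nonzero integers (i standing for a_i and −i for a_i^{−1} = a_{−i}), and free reduction becomes a one-pass stack computation. A minimal sketch (ours, not from the paper; all names are illustrative):

```python
def reduce_word(word):
    """Freely reduce a word over the symmetrized alphabet.
    Letters are nonzero ints, with -i denoting the inverse of i
    (the convention a_{-i} = a_i^{-1} above)."""
    stack = []
    for x in word:
        if stack and stack[-1] == -x:
            stack.pop()          # cancel an adjacent pair a a^{-1} or a^{-1} a
        else:
            stack.append(x)
    return stack

def is_reduced(word):
    # A word is reduced iff free reduction leaves it unchanged.
    return list(word) == reduce_word(word)
```

For instance, reduce_word([1, 2, -2, -1, 3]) returns [3], the reduced form of a_1 a_2 a_2^{−1} a_1^{−1} a_3.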

Suppose that w = x_1 ⋯ x_m is a word in F(A) (with each x_i ∈ Ã) and that p, q are vertices of a reduced A-graph Γ. We say that w labels a path in Γ from p to q if there exists a sequence of vertices p_0 = p, p_1, …, p_m = q such that, for every 1 ≤ i ≤ m, Γ has an x_i-labeled edge from p_{i−1} to p_i if x_i ∈ A, and an x_i^{−1}-labeled edge from p_i to p_{i−1} if x_i^{−1} ∈ A. If the start and end vertices of the path are equal (that is, if p = q), we say that the path is a circuit at p.

The free group on 𝐴 is written F ( A ) , and we identify it with the set of reduced words in A ̃ * . It is well known that every subgroup of F ( A ) is free [15]. It is also well known that every finitely generated subgroup 𝐻 of F ( A ) can be associated with a uniquely defined reduced rooted 𝐴-graph ( Γ ( H ) , 1 ) , called the Stallings graph of 𝐻, with the following property: a reduced word is in 𝐻 if and only if it labels a circuit in Γ ( H ) at vertex 1. We refer the reader to the seminal works of Serre [18] and Stallings [20] who introduced this combinatorial tool, and to [8, 12, 13, 14, 17] for some of its many applications.

Of particular interest for this paper are the following facts.

  • Given a tuple w = (w_1, …, w_k) of reduced words in F(A), one can effectively compute the Stallings graph (Γ(H), 1) of the subgroup H generated by the w_i, and this graph has at most n = Σ_i |w_i| vertices. Touikan [21] showed that (Γ(H), 1) can be computed in time O(n log* n). Recall that log*(n) is the least integer k such that the k-th iterate of the logarithm yields a result less than or equal to 1, that is, such that log^{(k)}(n) ≤ 1 < log^{(k−1)}(n). Equivalently, let b_1 = 2 and b_{k+1} = 2^{b_k}; then log* n = 1 if n ≤ b_1 and log* n = k if b_{k−1} < n ≤ b_k.

  • Once Γ(H) is constructed, deciding whether a word w_0 ∈ F(A) is an element of H is done by checking whether w_0 labels a circuit in Γ(H) at vertex 1. This can be done in time O(|w_0|).

  • Let V and E be the sets of vertices and edges of Γ(H), respectively. Let T be a spanning tree of Γ(H) (that is, a subgraph of Γ(H) which is a tree and contains every vertex of Γ(H)), and let E_T be the set of edges of T; then we have |E_T| = |V| − 1. For each vertex p of Γ(H), let u(p) be the only reduced word which labels a path in T from the root vertex 1 to vertex p (with u(1) the empty word). For each edge e of Γ(H) which is not in E_T, say an edge from vertex p_e to vertex q_e with label a(e) ∈ A, let b(e) = u(p_e) a(e) u(q_e)^{−1}. Then b(e) is a reduced word in F(A) which labels a circuit at 1, so that b(e) ∈ H. Moreover, the set {b(e) ∣ e ∈ E ∖ E_T} is a basis of H.

  • If T is a fixed spanning tree of Γ(H) and B is the corresponding basis of H, in bijection with E ∖ E_T, the expression of an element of H in that basis is obtained as follows. Given a reduced word w in H, consider the circuit at 1 labeled by w and the sequence of edges not in T traveled by this circuit, say e_1^{ε_1}, …, e_h^{ε_h}, where the e_i are in E ∖ E_T, ε_i = 1 if e_i is traversed in the direct sense, and ε_i = −1 if e_i is traversed backwards. Then w = b(e_1)^{ε_1} ⋯ b(e_h)^{ε_h}.

  • One can compute a spanning tree of Γ(H) in time O(|V| + |E|) (by classical depth-first search). Since |E| ≤ |V|·|A|, it follows that one can compute a spanning tree of Γ(H), and therefore a basis of H, in time O(|V|·|A|).

  • The subgroup H has finite index if and only if every vertex of Γ(H) is the origin (and hence also the terminus) of an a-labeled edge for every letter a ∈ A or, equivalently, if |E| = |V|·|A|. In that case, the index of H is |V|.
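The second fact above (membership by reading a circuit) is straightforward to implement once Γ(H) is available. A sketch under our own encoding (letters as nonzero ints as before; each edge is stored in both directions so that inverse letters can be read; the small example graph is ours):

```python
def member(delta, root, w0):
    """Decide w0 in H, given the Stallings graph of H.
    delta maps (vertex, letter) -> vertex; each a-labeled edge p -> q is
    stored as delta[(p, a)] = q and delta[(q, -a)] = p, letters being
    nonzero ints with -a denoting the inverse of a."""
    p = root
    for x in w0:
        p = delta.get((p, x))
        if p is None:
            return False     # w0 cannot be read at all in Γ(H)
    return p == root         # circuit at the root  <=>  w0 in H

# Illustrative example: Γ(H) for H = <a b a^{-1}>, with a = 1, b = 2:
# vertex 0 (root) --a--> vertex 1, and a b-labeled loop at vertex 1.
delta = {(0, 1): 1, (1, -1): 0, (1, 2): 1, (1, -2): 1}
```

Here member(delta, 0, [1, 2, -1]) returns True (a b a^{−1} ∈ H) while member(delta, 0, [2]) returns False.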

2.2 Growth modulus

The growth function of a set 𝐿 of words over an alphabet 𝐴 is the function s L ( n ) counting the words of length 𝑛 in 𝐿. The following result belongs to the folklore of combinatorial automata theory.

Fact 2.1

If L is a regular language, then its growth function (restricted to its support, which is the complement of an ultimately periodic sequence) is asymptotically equivalent to an expression of the form C n^k λ^n, with C > 0, k ∈ ℕ and λ ≥ 1.

The real number 𝜆 in Fact 2.1 is called the growth modulus of 𝐿.

Proof

A result usually attributed to Chomsky and Schützenberger [6] (see [9, Proposition I.3] for a quick proof) states that the generating function of 𝐿,

S_L(z) = Σ_{n≥0} s_L(n) z^n,

is a rational fraction. The partial fraction decomposition (over the reals) of this rational fraction yields the stated asymptotic equivalent for the coefficients s L ( n ) of S L ( z ) . ∎

We record the following elementary remark.

Remark 2.2

If a language L has at most Cλ^n words of length n (C > 0, λ > 1), then the number of its words of length at most n is at most C·λ/(λ − 1)·λ^n, which is Θ(λ^n).

The Perron–Frobenius theorem (see, e.g., [11, Theorem 8.4.4]) yields a more precise characterization of the growth modulus of a regular language. If 𝒜 is a finite state automaton over alphabet A with state set Q, we denote by G_𝒜 the graph with vertex set Q and with an edge from vertex p to vertex q for every letter a ∈ A labeling a transition from p to q. We also let M_𝒜 be the associated incidence (Q × Q)-matrix. Note that M_𝒜 has only non-negative integer coefficients. We say that 𝒜 is irreducible if M_𝒜 is, that is, if G_𝒜 is strongly connected.

The period of 𝒜 is defined to be the largest positive integer d such that Q can be partitioned as Q = Q_1 ∪ ⋯ ∪ Q_d in such a way that every transition from a state in Q_i leads to a state in Q_{i+1} (indices being taken modulo d). For instance, the minimal automaton of the language of words of length a multiple of d has period d. Finally, we say that 𝒜 is aperiodic if its period is 1.

The following result also belongs to the folklore; see [9, Proposition V.7].

Proposition 2.3

Let 𝐿 be a regular language, accepted by a deterministic finite state automaton 𝒜 which is irreducible and aperiodic. The growth modulus of 𝐿 is equal to the dominant eigenvalue of the transition matrix of 𝒜.

The growth modulus of the free group F ( A ) is easily computed.

Example 2.4

It is clear that, for every integer n ≥ 1, the number R_n of reduced words of length n is 2r(2r − 1)^{n−1}, where r = |A|.

Recall that we identify F(A) with the set of reduced words over the alphabet Ã. Thus the growth modulus of F(A) is 2r − 1.
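The count R_n = 2r(2r − 1)^{n−1} is easy to confirm by brute-force enumeration for small parameters (a throwaway check, not part of the paper; the encoding of letters as nonzero integers is ours):

```python
from itertools import product

def count_reduced(r, n):
    """Count reduced words of length n over the symmetrized alphabet of
    A = {a_1, ..., a_r}, letters encoded as nonzero ints (-i = inverse of i)."""
    letters = [i for i in range(-r, r + 1) if i != 0]
    # A word is reduced iff no letter is immediately followed by its inverse.
    return sum(1 for w in product(letters, repeat=n)
               if all(w[i] != -w[i + 1] for i in range(n - 1)))

# Agrees with the closed formula 2r(2r-1)^(n-1):
assert count_reduced(2, 5) == 2 * 2 * (2 * 2 - 1) ** 4
```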

2.3 Algorithmic problems: Average case complexity

In evaluating the complexity of an algorithm, one needs to specify the model of computation and the input space. The input space is usually equipped with a notion of size (a positive integer) such that there are finitely many inputs of any given size. It is important, in particular, to make clear which parameters of the problem are taken to be constants. Unless otherwise indicated, we consider in this paper that the rank r = | A | of the ambient free group is a constant.

The model of computation we adopt is the standard RAM model. Concretely, this means that if a word 𝑤 is part of the input, it takes unit time to move the reading head to a position 𝑖 (where 𝑖 has been computed before), and unit time to read the letter of 𝑤 in position 𝑖. Arithmetic operations on integers (addition, multiplication) are also considered as taking unit time.

Remark 2.5

In the standard RAM model, figuring out the length of an input word w (in order, for instance, to read its last letter) takes time O(log |w|), using an instance of the so-called exponentiation search method [4]: one reads letters in positions 1, 2, 4, 8, 16, etc., until one exceeds the length of the word after, say, b steps. At that point, we know that 2^{b−1} ≤ |w| < 2^b, that is, we know the leading bit of the binary expansion of |w|. The next bits are established by a classical dichotomy method. Concretely, probing position c = 2^{b−1} + 2^{b−2} allows us to know the second bit: 1 if position c is still in the word, 0 if c exceeds its length. This is repeated for the successive bits of the binary expansion of |w|.

Since we are going to work with algorithms with very low, even constant, average case complexity, we do not want to have to add the logarithmic time needed to compute the length of input words, and will therefore accompany every input word with its length.
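The doubling-plus-dichotomy search of Remark 2.5 can be sketched as follows (our code; probe(i) abstracts the unit-cost test "position i is within the word", and the function uses O(log |w|) probes):

```python
def word_length(probe):
    """Find |w| using only probes probe(i) = 'position i (1-based) is
    inside the word'.  Doubling phase, then binary search."""
    # Doubling phase: find b with 2^(b-1) <= |w| < 2^b.
    b = 0
    while probe(2 ** b):
        b += 1
    if b == 0:
        return 0                         # even position 1 is out: empty word
    lo, hi = 2 ** (b - 1), 2 ** b - 1    # |w| lies in [lo, hi]
    # Dichotomy phase: locate the last in-word position.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if probe(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

w = "abracadabra"
assert word_length(lambda i: i <= len(w)) == len(w)
```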

Remark 2.6

When handling “very large” integers, or words on a “very large” alphabet, it may be more appropriate to use the bitcost model: adding integers takes time linear in the length of their binary representations (that is, in their logarithm), and reading or comparing letters from a large alphabet 𝐴 takes time proportional to log | A | (since each letter can be encoded in a bit string of length log | A | ).

The worst case complexity of an algorithm 𝒜 is the function f(n), defined on ℕ, which accounts for the maximum time required to run 𝒜 on an input of size n. If a distribution is specified on the set of size n inputs (in this paper, the uniform distribution), the average case complexity of 𝒜 is the function g(n) which computes the expected time required to run 𝒜 on inputs of size n. Such complexity functions are usually considered up to asymptotic equivalence.

The average case complexity of an algorithm is obviously bounded above by the worst case complexity, and it may sometimes be much lower. The usual idea in discussing average case complexity is to distinguish, within the input space, between a subset of high probability where the algorithm performs very fast, and its low-probability complement, containing all the hard instances (those which witness the worst case complexity).

Finally, the worst case (resp., average case) complexity of a problem is the lowest worst case (resp., average case) complexity of an algorithm solving this problem.

A well-known example, which will be useful in the sequel, is the Proper Prefix Problem (PPP) on alphabet A: given two words u, v on alphabet A, decide whether u is a proper prefix of v (that is, whether v = uu′ for some non-empty word u′). We also consider the Prefix Problem (PP) (deciding whether u is a prefix of v) and the Equality Problem (EqP) (deciding whether u = v). The following observation is elementary: it is just an instance of the fact that the expected value of a geometric distribution is constant.

Lemma 2.7

Let A be a finite alphabet with |A| ≥ 2 and suppose that each set A^n (the set of words of length n) is equipped with the uniform distribution. Problems PPP, PP and EqP can be solved in constant expected time. The same holds over the set R_n of (reduced) words of length n in F(A).

Proof

We prove the statement for PPP. The proofs for PP and EqP are entirely similar. Here is a simple and natural algorithm solving the PPP.

Algorithm PPP

On input u , v , read 𝑢 and 𝑣 from left to right one letter at a time, comparing each letter of 𝑢 with the corresponding letter of 𝑣, and stopping when (i) a difference is detected (that is, the 𝑖-th letters of 𝑢 and 𝑣 are different for some 𝑖); (ii) the end of 𝑢 is reached but not the end of 𝑣; or (iii) the end of 𝑣 is reached.

It is clear that u is a proper prefix of v in case (ii), and not a proper prefix of v in every other case. That is, Algorithm PPP solves the PPP. We now show that it has constant average case complexity. Indeed, with high probability, a difference between the words u and v is detected without having to read either word to its end.

Let u_i (resp., v_i) denote the i-th letter of u (resp., v). At each step, verifying whether u_i = v_i is done in constant time, and this equality holds with probability p = 1/|A|. Therefore, the probability that the algorithm stops after exactly k steps (with k ≤ |u|, |v|) is p^{k−1}(1 − p). Detecting whether we have reached the end of u or v is also done in constant time. It follows that the average case complexity of Algorithm PPP is bounded above, up to a multiplicative constant, by

1 + p + ⋯ + p^{|u|−1} ≤ 1/(1 − p) = |A|/(|A| − 1) ≤ 2,

and this concludes the proof relative to A * .

To transfer this result to F(A), one can for instance rewrite the words in F(A) as follows. Let X = {x_1, …, x_{2r−1}} and let Ã = {a_{−r}, …, a_{−1}, a_1, …, a_r} be ordered in the natural way. For each b ∈ Ã, we let ζ_b be the order isomorphism from Ã ∖ {b} to X.

Given a word w = b_1 ⋯ b_n ∈ F(A) of length n ≥ 2, we let τ(w) = b_1 c_2 ⋯ c_n be the word in Ã X^{n−1} where c_{i+1} = ζ_{b_i^{−1}}(b_{i+1}) for every 1 ≤ i < n. It is clear that τ is a bijection between R_n and Ã X^{n−1}, and hence that it preserves the uniform distribution.

The result follows if we modify Algorithm PPP as follows: on input u and v, the algorithm first compares u_1 and v_1, and then it compares ζ_{u_i^{−1}}(u_{i+1}) and ζ_{v_i^{−1}}(v_{i+1}) until a difference is detected. ∎
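In code, Algorithm PPP is the obvious left-to-right loop below (our rendering); the point of the lemma is not the code but the analysis: on uniformly random inputs the loop exits after at most |A|/(|A| − 1) ≤ 2 expected letter comparisons.

```python
def is_proper_prefix(u, v):
    """Algorithm PPP: decide whether u is a proper prefix of v,
    comparing letters left to right and stopping at the first mismatch."""
    for i in range(min(len(u), len(v))):
        if u[i] != v[i]:
            return False        # case (i): a difference is detected
    # No mismatch: case (ii) iff u ended strictly before v did.
    return len(u) < len(v)

assert is_proper_prefix("ab", "abc")
assert not is_proper_prefix("abc", "abc")   # a prefix, but not a proper one
```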

Remark 2.8

Lemma 2.7 establishes that the expected running time of Algorithm PPP is bounded by a constant. This relies on our choice of the RAM model of computation, according to which it takes constant time to compare two letters in A. As mentioned in Remark 2.6, if A is very large, it may take non-trivial time to perform such a comparison (namely time O(log |A|)) and, in that case, the expected running time of Algorithm PPP is O(log |A|).

3 The central tree property: A generic property of tuples of words

If d is a positive integer, we say that the k-tuple w = (w_1, …, w_k) of words in F(A) has the central tree property of depth d (the d-ctp for short) if the w_i have length greater than 2d and the prefixes of length d of the w_i and the w_i^{−1} are pairwise distinct. We also say that w has the ctp if it has the d-ctp for some d < (1/2) min{|w_i| ∣ 1 ≤ i ≤ k}. The central tree property was formally introduced in [2] (see also [3]), but it was implicit in the literature, especially on the (exponential genericity of the) small cancellation property, since the ctp can be viewed as a small initial cancellation property.

Let w = (w_1, …, w_k) be a k-tuple of words in F(A). For convenience, we let min|w| = min{|w_i| ∣ 1 ≤ i ≤ k} and max|w| = max{|w_i| ∣ 1 ≤ i ≤ k}, and we write w_{−i} for w_i^{−1} (1 ≤ i ≤ k).

Suppose that w has the d-ctp. Then we let pr_i be the prefix of length d of w_i, and we let mf_d(w_i) be the middle factor of w_i, of length |w_i| − 2d. In particular, we have mf_d(w_{−i}) = mf_d(w_i)^{−1} and

(3.1) w_i = pr_i mf_d(w_i) pr_{−i}^{−1}

for −k ≤ i ≤ k, i ≠ 0.

Denote by L(w) the set of all the pr_i. By definition of the d-ctp, |L(w)| = 2k. Let also Γ_d(w) be the tree of prefixes of length at most d of the w_i and the w_i^{−1} (rooted at the empty word): its vertices are these prefixes, including the empty word, and it has an edge from a word v to a word w exactly if w = va (a ∈ Ã). We identify the words in L(w) with the corresponding leaves of Γ_d(w). If H = ⟨w⟩ (i.e., H = ⟨w_1, …, w_k⟩), then Γ(H) consists of the central tree Γ_d(w), together with k disjoint paths: for every 1 ≤ i ≤ k, there is such a path from vertex pr_i to vertex pr_{−i}, labeled mf_d(w_i). In view of equation (3.1), the word w_i labels a circuit at the root in Γ(H), going first through Γ_d(w) to the leaf pr_i, and returning to the root through the leaf pr_{−i}.

Remark 3.1

It follows directly from the definition that one can decide whether a k-tuple w has the d-ctp, and construct Γ_d(w), in time O(kd).
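The O(kd) test of Remark 3.1 amounts to collecting the 2k prefixes of length d and checking that they are pairwise distinct. A sketch with our integer encoding of letters (i for a_i, −i for its inverse; function names are ours):

```python
def inverse(w):
    """Inverse of a word: reverse it and invert each letter."""
    return [-x for x in reversed(w)]

def has_ctp(ws, d):
    """d-ctp: every word is longer than 2d, and the length-d prefixes of
    the words and of their inverses are pairwise distinct."""
    if any(len(w) <= 2 * d for w in ws):
        return False
    prefixes = ({tuple(w[:d]) for w in ws}
                | {tuple(inverse(w)[:d]) for w in ws})
    return len(prefixes) == 2 * len(ws)   # 2k pairwise distinct prefixes
```

For instance, (a b a b a, b a b a b) has the 2-ctp, whereas (a b c, a c b) fails the 1-ctp because both words start with the letter a.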

We record the following properties of k-tuples with the ctp, the first of which is [2, Lemma 1.2].

Proposition 3.2

Let d ≥ 1, let w be a tuple of words in F(A) with the d-ctp and let H = ⟨w⟩. Then H has infinite index and w is a basis of H.

Let μ = min|w| and ν = max|w|. If w_0 is a word in F(A) which belongs to H, then the length ℓ of the expression of w_0 in the basis w satisfies

ℓ(μ − 2d) ≤ |w_0| ≤ ℓν.

Proof

The infinite index property is immediately verified since Γ ( H ) has vertices of degree 2, namely the leaves of Γ d ( w ) ; see Section 2.1.

Let X = {x_1, …, x_k} be a k-letter alphabet and let φ: F(X) → F(A) be the morphism given by φ(x_i) = w_i. It is obvious that the image of φ is H. Recall that we let x_{−i} = x_i^{−1} for every 1 ≤ i ≤ k. Let x = x_{i_1} ⋯ x_{i_ℓ} be a non-empty reduced word in F(X). Then φ(x) is the word obtained by reducing w_{i_1} ⋯ w_{i_ℓ}. Because of the ctp, reduction occurs only in segments of length at most 2d around the boundary between w_{i_h} and w_{i_{h+1}} for each 1 ≤ h < ℓ. In particular,

ℓμ − 2d(ℓ − 1) ≤ |φ(x)| ≤ ℓν.

It follows that φ(x) ≠ 1, and hence φ is injective and w is a basis of H. ∎

It is well known that an infinite index subgroup H of F(A) has growth modulus smaller than 2r − 1. In the case where H is generated by a tuple with the ctp, its growth modulus is greatly constrained.

Proposition 3.3

Let w be a k-tuple of words with the d-ctp and let μ = min|w|. Let H = ⟨w⟩. Then the growth modulus of H is at most (2k − 1)^{1/(μ−2d)}.

Proof

Let X and φ be as in the proof of Proposition 3.2, let x ∈ F(X) and let w = φ(x) ∈ H. Then (μ − 2d)|x| ≤ |w|. If |w| = m, then |x| ≤ m/(μ − 2d). In particular, the set of words of H of length m is contained in the φ-image of the set of words in F(X) of length at most m/(μ − 2d). This set has cardinality Θ((2k − 1)^{m/(μ−2d)}); see Remark 2.2. The stated inequality follows since φ is a bijection between F(X) and H. ∎

We also record the following fact, which is elementarily verified and is well known (see, e.g., [1]).

Proposition 3.4

Let k ≥ 1. The probability for a k-tuple w of words in F(A) of length at most n to satisfy min|w| ≤ n/2 is O(k(2r − 1)^{−n/2}).

Proof

The number of words of length h is 2r(2r − 1)^{h−1}, so the number of words of length at most h is

1 + 2r + 2r(2r − 1) + ⋯ + 2r(2r − 1)^{h−1} = (r/(r − 1))(2r − 1)^h − 1/(r − 1).

The probability that a word in F(A) of length at most n actually has length at most n/2 is therefore asymptotically equivalent to

C(2r − 1)^{n/2 − n} = C(2r − 1)^{−n/2}

for some constant C, and the result follows by taking a union bound over the k components of w. ∎
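The closed form for the number of words of length at most h can be checked mechanically (a throwaway verification, not from the paper):

```python
# Verify  1 + sum_{j=1..h} 2r(2r-1)^(j-1)  =  (r(2r-1)^h - 1) / (r - 1)
# for a few small values of r and h; the division is always exact since
# 2r - 1 is congruent to 1 modulo r - 1.
for r in (2, 3, 5):
    for h in range(1, 10):
        lhs = 1 + sum(2 * r * (2 * r - 1) ** (j - 1) for j in range(1, h + 1))
        rhs = (r * (2 * r - 1) ** h - 1) // (r - 1)
        assert lhs == rhs
```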

We are interested in the d ( n ) -ctp, where d ( n ) is an increasing function of 𝑛. The following statement is derived from [2].

Proposition 3.5

Let r = |A| ≥ 2 and let k ≥ 2 be an integer. Let d(n) be a non-decreasing function of n such that d(n) < n/2. A random k-tuple of words in F(A) of length at most n fails the d(n)-ctp with probability O(k²(2r − 1)^{−d(n/2)}).

Proof

Let η_n = k²(2r − 1)^{−d(n/2)}. It is shown in [2, proof of Proposition 3.17] that the probability that a k-tuple w = (w_1, …, w_k) of words in F(A) of length at most n fails to have the d(n)-ctp is bounded above by the sum of 5η_n and the probability that k²(2r − 1)^{−d(min|w|)} > η_n.

We note that, if min|w| > n/2, then, since d(n) is non-decreasing, we have

k²(2r − 1)^{−d(min|w|)} ≤ k²(2r − 1)^{−d(n/2)} = η_n.

Therefore, the probability that w fails to have the d(n)-ctp is bounded above by the sum of the probability that min|w| ≤ n/2, which is O(k(2r − 1)^{−n/2}) by Proposition 3.4, and of 5η_n = 5k²(2r − 1)^{−d(n/2)}. This concludes the proof since k < k² and d(n/2) < n/4. ∎

If the size 𝑘 of the tuple of words is itself a function of 𝑛, Proposition 3.5 directly yields the following statement.

Corollary 3.6

Let r = |A| ≥ 2 and let k(n) be an integer function such that k(n) ≤ (2r − 1)^{n/2}.

  1. If k(n) is a constant function, then a random k(n)-tuple of words in F(A) of length at most n fails the (log n)-ctp with probability O(n^{−log(2r−1)}).

  2. If θ > 0, k(n) = n^θ and 0 < γ < 1, then a random k(n)-tuple of words in F(A) of length at most n fails the (2n)^γ-ctp with probability

    O(n^{2θ}(2r − 1)^{−n^γ}).

  3. At density β (that is, if k(n) = (2r − 1)^{βn}) for 0 < β < 1/8, and if 4β < γ < 1/2, then a random k(n)-tuple of words in F(A) of length at most n fails the (γn)-ctp with probability O((2r − 1)^{(4β−γ)n/2}).
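To see case (1) at work, one can estimate the failure probability empirically. The sketch below is entirely ours (random_reduced, the restated has_ctp and the chosen parameters are illustrative; words are sampled of length exactly n for simplicity, and log is the natural logarithm): it samples k-tuples of reduced words of length n and counts how often the ⌊log n⌋-ctp fails; the observed frequency is small, in line with the O(n^{−log(2r−1)}) bound.

```python
import math
import random

def inverse(w):
    return [-x for x in reversed(w)]

def has_ctp(ws, d):
    """d-ctp check, restated here to keep the sketch self-contained."""
    if any(len(w) <= 2 * d for w in ws):
        return False
    prefixes = ({tuple(w[:d]) for w in ws}
                | {tuple(inverse(w)[:d]) for w in ws})
    return len(prefixes) == 2 * len(ws)

def random_reduced(r, n, rng):
    """A uniformly random reduced word of length n (rejection sampling:
    each new letter is uniform among the 2r - 1 non-cancelling ones)."""
    letters = [i for i in range(-r, r + 1) if i != 0]
    w = [rng.choice(letters)]
    while len(w) < n:
        x = rng.choice(letters)
        if x != -w[-1]:
            w.append(x)
    return w

rng = random.Random(0)
r, k, n, trials = 2, 2, 200, 500
d = int(math.log(n))          # depth log n, as in case (1)
fails = sum(1 for _ in range(trials)
            if not has_ctp([random_reduced(r, n, rng) for _ in range(k)], d))
# fails / trials is a small fraction of the samples.
```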

4 The Uniform Membership Problem

The Uniform Membership Problem (UMP) on alphabet 𝐴 and for an integer k 1 is the following: given w 0 , a word in F ( A ) , and w = ( w 1 , , w k ) , a 𝑘-tuple of words in F ( A ) , decide whether w 0 belongs to the subgroup 𝐻 generated by w .

The notion of Stallings graph provides a nice algorithmic solution for the UMP, and for its extension where we also ask for the expression of w_0 in a basis of H whenever w_0 ∈ H (see [20]).

Algorithm MP

On input a pair ( w 0 , w ) of a reduced word and a 𝑘-tuple of words in F ( A ) ,

  1. Compute the Stallings graph Γ(H) of H = ⟨w⟩.

  2. Compute a spanning tree 𝑇 of Γ ( H ) (which specifies a basis 𝐵 of 𝐻; see Section 2.1).

  3. Try reading w_0 as the label of a path in Γ(H) starting at the root vertex, keeping track of the sequence of edges traversed in the complement of T. If one can indeed read w_0 in this fashion and the resulting path is a circuit, then w_0 ∈ H and the sequence of edges not in T yields the reduced expression of w_0 in the basis B (see Section 2.1); otherwise w_0 ∉ H.

Remark 4.1

We do not assume the length 𝑘 of the tuple w to be a constant. We will see it instead as a function of n = max | w | .

Proposition 4.2

The (worst case) complexity of Algorithm MP is

O(kn log*(kn) + rkn + m),

where n = max|w|, m = |w_0| and r = |A|.

Proof

This is a direct application of Section 2.1. ∎

Remark 4.3

If we only want to know whether w_0 ∈ H, step (2) can be skipped, and the complexity is O(kn log*(kn) + m).

Let us now consider Algorithm MP in more detail, in the case when w has the d-ctp for some d. In this situation, we can exploit the shape of Γ(H) described in Section 3. In particular, we get a spanning tree by removing one edge from the mf_d(w_i)-labeled path from pr_i to pr_{−i} for each 1 ≤ i ≤ k, and the corresponding basis is w. It does not, actually, matter which edge is removed. As we will see, we do not even need to compute explicitly the full picture of Γ(H).

Let 𝑋 be the 𝑘-letter alphabet X = { x 1 , , x k } and let φ : F ( X ) F ( A ) be the morphism which maps letter x i to w i , as in the proof of Proposition 3.2.

If w_0 ∈ H, then step (3) of Algorithm MP starts with an initialization step, identifying the first letter x_{i_1} of the expression x_0 of w_0 in the basis w and reading |w_{i_1}| − d letters of w_0, followed by a possibly iterated step, which identifies the next letter of x_0. The important observation is that, along each of these steps, a long factor of w_0 must be read (of length at least min|w| − 2d), and this factor must match one of a fixed collection of at most 2k words.

This leads to the family of Algorithms MP_d below, indexed by functions d: ℕ → ℕ, n ↦ d(n), each of which solves the UMP. We then prove that, for well-chosen d, the average case complexity of Algorithm MP_d is much lower than the worst case complexity of Algorithm MP.

Let d(n) be a non-decreasing function of n such that d(n) < n/2.

Algorithm MP_d

The input is the (k + 1)-tuple (w_0, …, w_k) of words in F(A). We let w = (w_1, …, w_k) and H = ⟨w⟩. No assumption (in particular, no ctp assumption) is made about the input. For convenience, we assume that we are also given the k-tuple of lengths (|w_1|, …, |w_k|) (see Remark 2.5), and we let n = max|w|. We again let X = {x_1, …, x_k} and φ: F(X) → F(A) be given by φ(x_i) = w_i (1 ≤ i ≤ k).

  1. Decide whether w has the d(n)-ctp and min|w| > n/2. This decision requires computing the set L(w) of length d(n) prefixes of the w_i and w_i^{−1}. This set is recorded in the form of the tree Γ_{d(n)}(w), which has at most 2k d(n) vertices and edges. There are two cases.

    1. If w has the d(n)-ctp and min|w| > n/2, go to step (2).

    2. Otherwise, run Algorithm MP to decide whether w_0 ∈ H, and find an expression of w_0 in a basis of H if it does.

  2. Start reading w 0 in Γ d ( n ) ( w ) from the root vertex. There are two cases.

    1. We reach a leaf of Γ_{d(n)}(w), say pr_i (-k ≤ i ≤ k, i ≠ 0; necessarily after reading exactly d(n) letters from w_0), and the middle factor mf_{d(n)}(w_i) is a proper prefix of the suffix of w_0 starting in position d(n)+1, that is, pr_i·mf_{d(n)}(w_i) is a proper prefix of w_0. In this case, move the reading head to position d(n) + |mf_{d(n)}(w_i)| + 1 = |w_i| - d(n) + 1 in w_0, record pr_{-i} as the last-leaf-visited, output the letter x_i ∈ X̃ and go to step (3).

    2. Otherwise, stop the algorithm and conclude that w_0 ∉ H.

  3. Suppose that the reading head on w_0 is in position j and that the last-leaf-visited is pr_i. Resume reading w_0 (from position j) in Γ_{d(n)}(w), starting at pr_i. There are three cases.

    1. We reach the end of w_0 while reading it inside Γ_{d(n)}(w), landing at the root vertex. In this case, stop the algorithm and conclude that w_0 ∈ H.

    2. After reading d′ letters from w_0, we reach a leaf of Γ_{d(n)}(w), say pr_{i′} (necessarily 2 ≤ d′ ≤ 2d(n) and i′ ≠ i), and the word mf_{d(n)}(w_{i′}) is a proper prefix of the suffix of w_0 starting in position j + d′ + 1. In this case, move the reading head to position j + d′ + (|w_{i′}| - 2d(n)) + 1 in w_0, record pr_{-i′} as the last-leaf-visited, output the letter x_{i′} ∈ X̃ and repeat step (3).

    3. Otherwise, stop the algorithm and conclude that w_0 ∉ H.
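In the favorable case, steps (2) and (3) thus amount to greedily decoding w_0 against the 2k generators and their inverses, jumping over a middle factor once a leaf has been identified. The following much-simplified sketch (our own illustration, not a transcription of Algorithm MP_d) captures the greedy decoding in the extreme case where consecutive factors do not cancel at all, so that every factor of w_0 matches some w_i^{±1} verbatim; letters are encoded as nonzero integers, with negation playing the role of inversion (a hypothetical encoding).

```python
def greedy_decode(w0, gens):
    """Try to factor the word w0 as a reduced product of the given generators
    and their inverses, matching greedily from left to right. Returns the list
    of factor indices (i for w_i, -i for w_i^{-1}), or None if decoding fails.
    Assumes no cancellation occurs between consecutive factors."""
    cands = []
    for i, w in enumerate(gens, start=1):
        cands.append((i, list(w)))
        cands.append((-i, [-a for a in reversed(w)]))  # the inverse word
    out, pos = [], 0
    while pos < len(w0):
        for idx, w in cands:
            # a factor may not be followed by its own inverse (reducedness)
            if out and idx == -out[-1]:
                continue
            if w0[pos:pos + len(w)] == w:
                out.append(idx)
                pos += len(w)
                break
        else:
            return None  # no generator matches here: decoding fails
    return out
```

Under the d(n)-ctp, the actual algorithm only ever compares prefixes of length at most 2d(n) and skips the middle factors, which is what brings the expected cost down to O(k·d(n)) rather than a quantity proportional to |w_0|.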

Theorem 4.4

Let d(n) be a non-decreasing function of n such that d(n) < n/2. Algorithm MP_d solves the Uniform Membership Problem in F(A) and, if w_0 ∈ H, finds an expression of w_0 in a basis of H.

Let r = |A| ≥ 2, 0 < δ < 1/4 and 0 < β < 1/2 - 2δ, and suppose that d(n) ≤ δn. If we restrict the input space to pairs of the form (w_0, w), where max|w| = n and w is a tuple of length k ≤ (2r-1)^{βn}, then the average case complexity of Algorithm MP_d is

O(k·d(n) + k^3·n(2r-1)^{-d(⌈n/2⌉)}(r + log*(kn)) + k^2(2r-1)^{-d(⌈n/2⌉)}·m),

where m = |w_0|. If the space of inputs is further restricted to those inputs where w has the d(n)-ctp and min|w| > n/2, then the expected running time is O(k·d(n)), independent of |w_0|.

Proof

Algorithm MP_d always stops because every one of its steps takes a finite amount of time and the only repeated step (step (3)) reads a positive number of letters of w_0. If Algorithm MP_d stops at step (1), then it answers the question whether w_0 ∈ H and, in the affirmative case, finds an expression for it in a basis of H. If it stops at step (2), then w_0 ∉ H. And if it stops at step (3), then Algorithm MP_d outputs a word x_0 on the alphabet X̃, one letter at a time (one at the completion of step (2) and one at the completion of each iteration of step (3) except for the last one). As observed in the description of step (3), each new letter cannot be the inverse of the preceding one (because Γ_{d(n)}(w) is a tree), so the word x_0 is always reduced, that is, x_0 ∈ F(X). Moreover, the last iteration of step (3) concludes either that w_0 ∉ H, or that w_0 ∈ H and x_0 = φ^{-1}(w_0) (that is, x_0 is the expression of w_0 in basis w).

We now proceed with bounding the expected running time of Algorithm M P d . We use the following notation:

μ = min|w|,  𝔭 = 2k(2r-1)^{-⌈n/2⌉+d(n)},  𝔮 = 2k(2r-1)^{-2-⌈n/2⌉+2d(n)} = (2r-1)^{d(n)-2}·𝔭.

Our hypotheses on 𝑘 and 𝑑 imply that both 𝔭 and 𝔮 tend to 0 as 𝑛 tends to infinity.

Step (1) first requires comparing the lengths of w_1, …, w_k with n/2, deciding whether w has the d(n)-ctp and, if so, computing Γ_{d(n)}(w). This takes time O(k·d(n)) (see Remark 3.1). If w does not have the d(n)-ctp or if μ ≤ n/2, step (1) runs Algorithm MP, in time O(kn·log*(kn) + rkn + m) (see Proposition 4.2). By Propositions 3.4 and 3.5, this happens with probability O(k^2(2r-1)^{-d(⌈n/2⌉)}) (since we assumed that d(n) < n/2, and hence k(2r-1)^{-⌈n/2⌉} < k^2(2r-1)^{-d(⌈n/2⌉)}).

With the complementary probability, w has the d(n)-ctp, μ > n/2 and Algorithm MP_d proceeds to step (2).

We now need to decide in which of the two cases of step (2) we are, that is, we need to solve the PPP (Proper Prefix Problem) 2k times: for every pair of input words (u, w_0), where u = pr_i·mf_{d(n)}(w_i) for some -k ≤ i ≤ k, i ≠ 0. The expected time for this is O(k) (see Lemma 2.7). Note that, by the ctp, the output will be positive for at most one of these u, thus uniquely identifying the leaf pr_i of Γ_{d(n)}(w) which is first visited when reading w_0. Moreover, the probability that the algorithm does not stop here – and therefore moves to step (3) –, that is, the probability that one of these words u is indeed a proper prefix of w_0, is[2] O(2k(2r-1)^{-μ+d(n)}). As μ > n/2, this probability is O(𝔭).

Thus, with probability O(𝔭), we enter a loop where step (3) is repeated. Consider one such iteration of step (3), starting with the reading head in position j on w_0 and the vertex pr_i as the last-leaf-visited. Let w_0′ be the suffix of w_0 starting at position j. To decide in which of the cases of step (3) we are, we first consider whether |w_0| = j + d(n), and if so, we solve the EqP (Equality Problem) on input (pr_i^{-1}, w_0′). If indeed w_0′ = pr_i^{-1}, the algorithm stops and concludes that w_0 ∈ H. This is done in constant expected time. If w_0′ ≠ pr_i^{-1}, we solve the PPP 2k - 1 times, for every input pair (u, w_0′), where u = pr_i^{-1}·pr_{i′}·mf_{d(n)}(w_{i′}) and -k ≤ i′ ≤ k, i′ ≠ 0, i. By Lemma 2.7, this is done in expected time O(k·d(n)) (the factor d(n) corresponds to the work needed to reduce pr_i^{-1}·pr_{i′} before solving the PPP). The probability that the algorithm continues to a new iteration of step (3), namely the probability that one of these words u is a proper prefix of w_0′, is O((2k-1)(2r-1)^{-(2+μ-2d(n))}). Since μ > n/2, we have

(2k-1)(2r-1)^{-(2+μ-2d(n))} ≤ 𝔮.

The expected time required for running Algorithm M P d can be analyzed as follows. Step (1) runs in expected time

O(k·d(n)) + O(k^2(2r-1)^{-d(⌈n/2⌉)}·(kn·log*(kn) + rkn + m)) = O(k·d(n) + k^3·n(2r-1)^{-d(⌈n/2⌉)}(r + log*(kn)) + k^2(2r-1)^{-d(⌈n/2⌉)}·m).

With probability 1 - O(k^2(2r-1)^{-d(⌈n/2⌉)}), Algorithm MP_d proceeds to step (2).

Step (2) runs in expected time O(k). With probability O(𝔭), the algorithm proceeds to step (3) and stops with the complementary probability.

Each iteration of step (3) runs in expected time O(k·d(n)). Step (3) is repeated with probability O(𝔮), and the algorithm stops with the complementary probability.

It follows that the expected running time of step (2) and the ensuing iterations of step (3) is O(k(1 + 𝔭·d(n)(1 + 𝔮 + 𝔮^2 + 𝔮^3 + ⋯))), which is

O(k·d(n)(1 + 𝔭/(1-𝔮))).

Since 𝔭 and 𝔮 tend to 0, this is O ( k d ( n ) ) , independently of how many times step (3) is iterated. The expected running time of Algorithm M P d is therefore at most

O(k·d(n) + k^3·n(2r-1)^{-d(⌈n/2⌉)}·log*(kn) + k^2(2r-1)^{-d(⌈n/2⌉)}·m).

Finally, suppose that the input (w_0, w) is such that min|w| > n/2 and w has the d(n)-ctp. Then step (1) consists only in computing Γ_{d(n)}(w). The expected running time of Algorithm MP_d is therefore, on this smaller set of inputs, O(k·d(n)). ∎

As a corollary, we get upper bounds on the average case complexity of the Uniform Membership Problem.

Corollary 4.5

The Uniform Membership Problem (UMP) for F ( A ) , with input a k ( n ) -tuple of words of length at most 𝑛, and an additional word of length 𝑚, can be solved in expected time C ( n , m ) as follows (where r = | A | is taken to be constant).

  1. If k is constant, then C(n,m) = O(log n + m·n^{-log(2r-1)}), improving on its worst case complexity, namely O(n·log* n + m).

  2. Let β > 0, 0 < γ < 1. If k = n^β, then

    C(n,m) = O(n^{β+γ} + m·n^{2β}(2r-1)^{-n^γ}),

    improving on its worst case complexity, namely O(n^{β+1}·log* n + m).

  3. If k = n^β for some β > 0, we also have

    C(n,m) = O(n^β·log n + n·log* n + m·n^{-β}).

  4. For any 0 < β < 1/18, if k = (2r-1)^{βn}, then for every 0 < ε < 1/8 - 9β/4,

    C(n,m) = O(n(2r-1)^{βn} + m(2r-1)^{(9β/4 - 1/8 + ε)n}),

    improving on its worst case complexity, namely O(n(2r-1)^{βn}·log* n + m).

Proof

The worst case complexities mentioned in each item follow from Proposition 4.2. For every item, we apply Theorem 4.4 for an appropriate choice of the function d ( n ) .

(1) Suppose that k is a constant function and let d(n) = log n. Note that

(2r-1)^{-log n} = n^{-log(2r-1)} and lim_{n→∞} n^{1-log(2r-1)}(r + log*(kn)) = 0.

Theorem 4.4 then shows that the average case complexity of Algorithm MP_d is O(log n + m·n^{-log(2r-1)}), as stated.

(2) Suppose now that k = n^β and let d(n) = (2n)^γ. We can, again, apply Theorem 4.4. Since lim_{n→∞} n^{3β+1}(2r-1)^{-n^γ}(r + log*(n^{β+1})) = 0, the average case complexity of Algorithm MP_d is O(n^{β+γ} + m·n^{2β}(2r-1)^{-n^γ}), as stated.

(3) Suppose, again, that k = n^β and let

d(n) = (3β/log(2r-1))·log(2n).

Then (2r-1)^{-d(⌈n/2⌉)} = n^{-3β}, and Theorem 4.4 shows that, in this case, the average case complexity of Algorithm MP_d is O(n^β·log n + n·log* n + m·n^{-β}).

(4) Suppose that k = (2r-1)^{βn} with 0 < β < 1/18. This inequality guarantees that 4β < 1/4 - β/2. Let 0 < ε < 1/8 - 9β/4 and δ = 1/4 - β/2 - 2ε (so that δ > 4β), and let d(n) = δn. Then the hypotheses of Theorem 4.4 are satisfied with this β and δ. As a result, the average case complexity of Algorithm MP_d is

O(n(2r-1)^{βn} + n(2r-1)^{(3β-δ/2)n}·log* n + m(2r-1)^{(2β-δ/2)n}).

Since 4β < δ, we have 3β - δ/2 < β, and the second summand is less than the first. Moreover, 2β - δ/2 = 9β/4 - 1/8 + ε, and the stated result follows. ∎

5 The Primitivity and the Relative Primitivity Problems

An element w of a free group F(A) is said to be primitive (in F(A)) if F(A) admits a basis containing w. Equivalently, w is primitive if the cyclic subgroup ⟨w⟩ is a free factor of F(A). The Primitivity Problem (PrimP) on alphabet A consists in deciding, given a word w ∈ F(A), whether w is primitive in F(A).

The Primitivity Problem is closely related to the following Whitehead problem: given two words v , w F ( A ) , decide whether there exists an automorphism 𝜑 of F ( A ) such that φ ( w ) = v . The first step in Whitehead’s classical solution to this problem [23] identifies the minimal length of the automorphic images of 𝑣 and 𝑤. The classical solution of PrimP is a by-product of this first step: a word is primitive if and only if its orbit under the action of Aut ( F ( A ) ) contains a word (and so all words) of length 1. This solution of PrimP is linear in m = | w | , but exponential in r = | A | (relying, as it does, on an exploration of the action of the Whitehead automorphisms, whose number is exponential in 𝑟).

Roig, Ventura and Weil [17, Fact 3.6] modified Whitehead’s algorithm, resulting in Algorithm 𝒫 which solves the Primitivity Problem in time O ( m 2 r 3 ) , where m = | w | . We do not describe Algorithm 𝒫 in this paper as we will use it as a black box.

The Relative Primitivity Problem (RPrimP) on alphabet A and for an integer k ≥ 1 is the following: given a word w_0 in F(A) and a k-tuple w = (w_1, …, w_k) of words in F(A), decide whether w_0 belongs to H = ⟨w⟩ and, if it does, whether it is primitive in H.

Solving RPrimP is done naturally by combining an algorithm solving the Uniform Membership Problem with, in the case of an affirmative answer, the computation of the expression x_0 of w_0 in a basis B of H, followed by an algorithm solving PrimP in F(B) (for example, Algorithm 𝒫 mentioned above, with worst case complexity O(m^2·r^3)). By Proposition 4.2, this results in a worst case complexity of

O(kn·log*(kn) + rkn + m^2·k^3)

(since the rank of 𝐻 is at most 𝑘).

Recently, Shpilrain [19] gave an algorithm solving PrimP in F ( A ) with constant average case complexity. This constant average case complexity assumes, as we have done so far, that the rank 𝑟 of the ambient free group F ( A ) is fixed. However, we cannot make this assumption anymore since we need to solve PrimP in free subgroups of F ( A ) , whose rank may be as large as 𝑘. We therefore revisit this algorithm in detail in Section 5.1, and we recompute its average case complexity to ascertain its dependency in 𝑟. The average case complexity of the combination of this algorithm with Algorithm M P d (Section 4) is discussed in Section 5.2.

5.1 Shpilrain’s primitivity algorithm

Recall that a word u in F(A) is cyclically reduced if its last letter is not the inverse of its first letter, that is, if u^2 is reduced. It is clear that any word u factors in a unique fashion as u = v·w·v^{-1} with w cyclically reduced, and we call w the cyclic core of u, written κ(u). It is immediate that u is primitive if and only if κ(u) is.

If u = x_1 ⋯ x_n is a reduced word of length at least 2, let W(u) be the Whitehead graph of u, namely the simple (undirected) graph on vertex set Ã, with an edge from vertex x to vertex y if there exists 1 ≤ i ≤ n such that x_i x_{i+1} = x·y^{-1} or y·x^{-1} (here x_{n+1} stands for x_1). Observe that W(u) can be constructed one edge at a time when reading u from left to right, in time O(|u|).
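With letters encoded as nonzero integers, negation playing the role of inversion (a hypothetical encoding), this one-pass construction can be sketched as follows:

```python
def whitehead_graph(u):
    """Whitehead graph of a cyclically reduced word u: an undirected edge
    {x, y} for every cyclic factor x_i x_{i+1}, with x = x_i and
    y = x_{i+1}^{-1} (indices taken mod n, so x_{n+1} = x_1)."""
    n = len(u)
    edges = set()
    for i in range(n):
        x, y = u[i], -u[(i + 1) % n]
        edges.add(frozenset((x, y)))
    return edges
```

For instance, u = ab (encoded [1, 2]) yields the two edges {a, b^{-1}} and {b, a^{-1}}.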

Recall finally that a vertex 𝑝 of a connected graph 𝐺 is a cut vertex if deleting 𝑝 from 𝐺 (and all the edges adjacent to 𝑝) results in a disconnected graph. Whitehead showed the following [22].
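Algorithm 𝒮 below repeatedly needs exactly this test, "connected with no cut vertex". It can be decided in linear time by the classical low-link depth-first search; here is a sketch (our own, for an ad hoc encoding of graphs as sets of 2-element frozensets):

```python
def connected_no_cut_vertex(vertices, edges):
    """Return True iff the undirected simple graph (vertices, edges) is
    connected and has no cut vertex, via one DFS computing low-links."""
    adj = {v: [] for v in vertices}
    for e in edges:
        a, b = tuple(e)
        adj[a].append(b)
        adj[b].append(a)
    disc, low, time = {}, {}, [0]
    has_cut = [False]

    def dfs(v, parent):
        disc[v] = low[v] = time[0]
        time[0] += 1
        children = 0
        for u in adj[v]:
            if u not in disc:
                children += 1
                dfs(u, v)
                low[v] = min(low[v], low[u])
                # non-root v is a cut vertex if some child cannot reach above v
                if parent is not None and low[u] >= disc[v]:
                    has_cut[0] = True
            elif u != parent:
                low[v] = min(low[v], disc[u])
        # the root is a cut vertex iff it has more than one DFS child
        if parent is None and children > 1:
            has_cut[0] = True

    dfs(next(iter(vertices)), None)
    return len(disc) == len(vertices) and not has_cut[0]
```

Each call visits every vertex and edge at most a bounded number of times, which is the O(V+E) cost used in the complexity analysis of Algorithm 𝒮.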

Proposition 5.1

Let 𝑢 be a cyclically reduced word of length at least 2 in F ( A ) . If 𝑢 is primitive, then either W ( u ) is disconnected, or W ( u ) admits a cut vertex.

Shpilrain’s Algorithm 𝒮 (slightly modified) is as follows [19]. We let

g(n) = n - log(n^4·r^6)/log(2r-1).

Algorithm 𝒮

On input a reduced word u F ( A ) (together with its length 𝑛),

  1. compute κ(u), the cyclic core of u, say κ(u) = x_1 ⋯ x_h. If h = |κ(u)| ≤ g(n), go to step (4). Otherwise, let i = 2, let W be the graph with vertex set Ã and no edges (so that every vertex is its own connected component) and go to step (2).

  2. Read x_i, add the edge (x_{i-1}, x_i^{-1}) to W and update the list of connected components of W. If W is connected and has no cut vertex, stop the algorithm: u is not primitive in F(A). Otherwise, if i < h, increment i by a unit and repeat step (2), and if i = h, go to step (3).

  3. Add to W the edge (x_h, x_1^{-1}) and update the list of connected components of W. If W is connected and has no cut vertex, stop the algorithm: u is not primitive in F(A). Otherwise, go to step (4).

  4. Run Algorithm 𝒫 on κ ( u ) to decide whether 𝑢 is primitive in F ( A ) .

Algorithm 𝒮 certainly solves the Primitivity Problem (using Proposition 5.1) since the graph 𝑊 constructed in steps (2) and (3) is an increasingly larger fragment of W ( κ ( u ) ) . The algorithm stops when either 𝑊 is connected and has no cut vertex, in which case W ( κ ( u ) ) has the same property and 𝑢 is therefore not primitive in F ( A ) , or when Proposition 5.1 has failed to give us an answer and Algorithm 𝒫 has been called to settle the issue.

Shpilrain showed in [19] that the average case complexity of Algorithm 𝒮 is bounded above by a constant, independent of the length of the input word. This constant does however depend on the ambient rank 𝑟, and we specify this dependency in Proposition 5.5 below. Before we state this proposition, we need to record a few results.

First recall that the number R n of reduced words of length 𝑛 in F ( A ) is

2r(2r-1)^{n-1}.

The number CR_n of cyclically reduced words of length n satisfies

2r(2r-1)^{n-2}(2r-2) ≤ CR_n ≤ 2r(2r-1)^{n-1} = R_n.

It follows that the probability that a reduced word is not cyclically reduced is at most 1 - (2r-2)/(2r-1) = 1/(2r-1) (not exactly 1/(2r), as asserted in [19], because the first and last letters of a reduced word are random variables that are close to, but not exactly, independent from each other).
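These counts are easy to confirm by brute force for small parameters; the following check (our own, with letters encoded as nonzero integers) enumerates all reduced words for r = 2, n = 6 and compares them with the formulas above:

```python
from itertools import product

def reduced_words(r, n):
    """All reduced words of length n over {±1, ..., ±r} (negation = inversion)."""
    letters = list(range(1, r + 1)) + list(range(-r, 0))
    return [w for w in product(letters, repeat=n)
            if all(w[i + 1] != -w[i] for i in range(n - 1))]

r, n = 2, 6
words = reduced_words(r, n)
assert len(words) == 2 * r * (2 * r - 1) ** (n - 1)   # the formula for R_n
not_cyc = sum(1 for w in words if w[-1] == -w[0])     # not cyclically reduced
assert not_cyc / len(words) <= 1 / (2 * r - 1)        # the bound above
```

The enumeration is exponential in n, of course; it is only meant as a sanity check of the counting argument, not as an algorithm.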

The following (obvious!) algorithm computes the cyclic core of a word in F ( A ) .

Algorithm C R

On input a reduced word u = a_1 ⋯ a_n of length n and as long as n ≥ 3, compare a_n with a_1^{-1}; if they are equal, delete the first and last letters of u and repeat this step; if they are different, return the word u.
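With letters encoded as nonzero integers (negation as inversion, a hypothetical encoding), Algorithm CR is a two-pointer scan:

```python
def cyclic_core(u):
    """Algorithm CR: peel matching inverse first/last letters off the reduced
    word u until it is cyclically reduced (or has length < 3)."""
    i, j = 0, len(u) - 1
    while j - i + 1 >= 3 and u[j] == -u[i]:
        i += 1
        j -= 1
    return list(u[i:j + 1])
```

Each comparison either stops the scan or shortens the word by two letters, which is the mechanism behind the constant expected cost established in Lemma 5.2 below.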

Lemma 5.2

The average case complexity of Algorithm C R is O ( 1 ) , independent of the size 𝑟 of the alphabet.

Proof

Let p_n be the probability that a length n reduced word is not cyclically reduced. As observed before, p_n ≤ 1/(2r-1).

Every step of Algorithm CR compares two letters from Ã, and hence takes constant time C. On input u, of length n, Algorithm CR concludes in 1 step (that is, the case where u is cyclically reduced) with probability 1 - p_n, and otherwise repeats its single step, on a length n - 2 input. Thus the expected time is bounded above by

(1 + p_n + p_n·p_{n-2} + p_n·p_{n-2}·p_{n-4} + ⋯)·C ≤ (∑_{i≥0}(2r-1)^{-i})·C.

Since ∑_{i≥0}(2r-1)^{-i} = (2r-1)/(2r-2) ≤ 3/2, this concludes the proof. ∎

We will also use the following fact.

Lemma 5.3

The probability that the cyclic core of a length n element of F(A) has length less than or equal to n - 2ℓ is O((2r-1)^{-ℓ}).

Proof

This is a side product of the proof of Lemma 5.2: as observed there, Algorithm CR concludes in 1 step with probability 1 - p_n ≤ 1. It concludes in exactly 2 steps with probability p_n(1 - p_{n-2}) ≤ (2r-1)^{-1}, and it concludes in h+1 steps with probability p_n·p_{n-2} ⋯ p_{n-2h+2}(1 - p_{n-2h}) ≤ (2r-1)^{-h}. Now κ(u) has length n - 2h if and only if Algorithm CR terminates in h+1 steps. So |κ(u)| ≤ n - 2ℓ if Algorithm CR terminates in at least ℓ+1 steps, and this happens with probability at most ∑_{i≥ℓ}(2r-1)^{-i}. This quantity is

(2r-1)/(2r-2)·(2r-1)^{-ℓ} ≤ (3/2)·(2r-1)^{-ℓ}. ∎

An important observation is that if u is a random word in F(A) of length n, then, with high probability, W(u) is connected and has no cut vertex. More precisely, the following holds. If u = x_1 ⋯ x_n is a reduced word of length at least 2, let W′(u) be the simple graph with vertex set Ã, and with an edge from vertex x to vertex y if there exists 1 ≤ i < n such that x_i x_{i+1} = x·y^{-1} or y·x^{-1}. Note that this is almost identical to the definition of the Whitehead graph W(u), except that we do not consider the case where i = n. In particular, W′(u) is a subgraph of W(u), and if W′(u) is connected and has no cut vertex, then the same property holds for W(u).

Proposition 5.4

Let r ≥ 2 and let F = F(A), with |A| = r. There exists a positive number α(r) < 1 - 1/(2r^2) with the following property: the probability, for a word u of length n in F(A), that W′(u) is disconnected, or is connected and has a cut vertex, is Θ(n^k·α(r)^n) for some k ∈ ℕ.

Proof

Let 𝒢 be the set of simple graphs on vertex set Ã (that is, undirected loop-free graphs without multiple edges). If G ∈ 𝒢, let A(G) be the Ã-automaton with the same vertex set, whose edges are as follows: for every edge of G connecting vertices a and b (a, b ∈ Ã), A(G) has a b^{-1}-labeled edge from state a to state b^{-1} and an a^{-1}-labeled edge from state b to state a^{-1}. Let also L(G) be the set of all words in Ã* which label a path in A(G), with no condition on its starting and ending points; in particular, L(G) is a regular language. Finally, let M(G) be the transition matrix of A(G), that is, the order 2r matrix whose (a,b)-entry is 1 if A(G) has an edge from vertex a to vertex b, and 0 otherwise.

For G, G′ ∈ 𝒢, say that G ≤ G′ if every edge of G is also an edge of G′. It is clear that if G ≤ G′ and G is connected and has no cut vertex, then the same holds for G′. Let G_1, …, G_h be the ≤-maximal elements of 𝒢 which are either disconnected, or connected and with a cut vertex. Let λ_i be the growth modulus of L(G_i); then the union of the L(G_i) has growth modulus λ_0 = max_{1≤i≤h} λ_i.

Now let X denote the set of reduced words u such that W′(u) is disconnected, or is connected and has a cut vertex. Observe that if u ∈ L(G_i), then W′(u) ≤ G_i, and hence L(G_i) is contained in X. That is, X contains the union of the L(G_i) (i ∈ [1,h]). Moreover, for every word u with first letter a, we have u ∈ a·L(W′(u)), so that X is contained in the union of the a·L(G_i) (a ∈ Ã and i ∈ [1,h]). It is easily verified that a·L(G_i) and L(G_i) have the same growth rate, so both unions have growth rate λ_0, and hence so does X.

Thus the number of length n words in X is asymptotically equivalent to C·n^k·λ_0^n, where C > 0 and k ∈ ℕ (see Fact 2.1). As the growth modulus of the language of all reduced words is 2r-1 and α(r) was defined to be equal to λ_0/(2r-1), it follows that the probability that a length n word is in X is asymptotically equivalent to C·n^k·α(r)^n.

In order to conclude the proof, we need to establish an explicit upper bound for α(r), as a function of r. This is done rather abruptly (following reasoning similar to that in [5]): for each a, b ∈ Ã with b ≠ a, a^{-1}, let G_{a,a} be obtained from the maximum element of 𝒢 (which has an edge between every pair of distinct vertices in Ã) by deleting the edge between a and a^{-1}, and let G_{a,b} be obtained from the same maximum element by deleting the edge between a and b (Figure 1). Then every G_i satisfies either G_i ≤ G_{a,a} or G_i ≤ G_{a,b} for some a, b ∈ Ã. In particular, L(G_i) is contained in L(G_{a,a}) or L(G_{a,b}), and hence λ_i is less than or equal to the growth modulus of L(G_{a,a}) or L(G_{a,b}). Since G_{a,a} and G_{a,b} are irreducible and aperiodic, Proposition 2.3 shows that these growth moduli are the leading eigenvalues of M(G_{a,a}) and M(G_{a,b}), respectively.

Figure 1

The transition matrices of A(G_{a_1,a_1}) (on the left) and A(G_{a_1,a_2}) (on the right) for r = 4, where the vertex set is {a_1, a_{-1}, a_2, a_{-2}, …, a_r, a_{-r}}, in that order.

It should be clear that these growth moduli do not depend on the choice of a, b ∈ Ã. Facts A.1 and A.2 from the appendix show that both are at most

(2r-1)(1 - 1/(2r^2)),

which completes the proof. ∎
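The two leading eigenvalues can also be estimated numerically. The following check (our own, using plain power iteration) builds the transition matrices M(G_{a_1,a_1}) and M(G_{a_1,a_2}) for r = 4 and confirms that both spectral radii are strictly below 2r - 1 = 7:

```python
def spectral_radius(M, iters=1000):
    """Power iteration for a primitive nonnegative matrix."""
    n = len(M)
    v = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

def transition_matrix(r, deleted):
    """M(G) for G = clique on the 2r letters minus the single edge `deleted`.
    States are 0..2r-1, with inv(2i) = 2i+1; M[x][y] = 1 iff G has the edge
    {x, inv(y)} (a state never transitions into its own inverse)."""
    inv = lambda x: x ^ 1
    n = 2 * r
    M = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if y != inv(x) and {x, inv(y)} != set(deleted):
                M[x][y] = 1
    return M

r = 4
rho1 = spectral_radius(transition_matrix(r, (0, 1)))  # delete {a_1, a_1^{-1}}
rho2 = spectral_radius(transition_matrix(r, (0, 2)))  # delete {a_1, a_2}
assert rho1 < 2 * r - 1 and rho2 < 2 * r - 1
```

Since the deleted edge only affects two rows of the matrix, both values lie strictly between the minimum and maximum row sums, 2r - 2 and 2r - 1.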

Proposition 5.5

Let r ≥ 2. There exists a positive number β(r) < 1 - 1/(2r^2) such that the average case complexity of Algorithm 𝒮 is O((r/(1-β(r)))^2 + r^3). In particular, this average case complexity is O(r^6).

Proof

Let k and α(r) be given by Proposition 5.4, and let β(r) be a number such that α(r) < β(r) < 1 - 1/(2r^2). Note that n^k·α(r)^n = O(β(r)^n).

Step (1) of Algorithm 𝒮 takes constant expected time; see Lemma 5.2. It is a classical result (usually referred to [10]) that connectedness and the presence of a cut vertex in a graph with 𝑉 vertices and 𝐸 edges can be decided in time O ( V + E ) . For the graphs occurring in the algorithm, which are subgraphs of the Whitehead graph W ( u ) , we have

V = 2r and E ≤ 2r(2r-1),

so O(V+E) = O(r^2). It follows that each iteration of step (2) takes time O(r^2) since W has 2r vertices, and the same holds for step (3). Finally, step (4) takes time O(m^2·r^3), where m is the length of the input word.

By Lemma 5.3, the probability that step (1) directly leads to step (4), that is, the probability that |κ(u)| ≤ g(n), is

O((2r-1)^{-(n-g(n))/2}) = O(n^{-2}·r^{-3}).

So the contribution of this configuration to the average case complexity of Algorithm 𝒮 is O(n^{-2}·r^{-3})·O(g(n)^2·r^3) = O(1).

Let q < |κ(u)| and let p be the length q-1 prefix of κ(u). Step (2) is iterated at least q times if the graph W′(p) is disconnected, or is connected and has a cut vertex. This happens with probability at most C·β(r)^{q-1} for some constant C > 0.

Thus the expected running time of Algorithm 𝒮 is a big-𝒪 of

∑_q β(r)^{q-1}·q·r^2 + β(r)^{g(n)}·n^2·r^3 ≤ (r/(1-β(r)))^2 + β(r)^{g(n)}·n^2·r^3.

The inequality above is justified as follows: for | s | < 1 , we have

∑_q q·s^{q-1} = d/ds(∑_q s^q) = d/ds(1/(1-s)) = 1/(1-s)^2.

Moreover, we have

β(r)^{g(n)}·n^2 < β(r)^n·n^2 < (1 - 1/(2r^2))^n·n^2.

If n is large enough with respect to r, this quantity is less than 1. More precisely, suppose that n/log n > 4r^2. Then

log(β(r)^{g(n)}·n^2) < 2·log n + n·log(1 - 1/(2r^2)) < 2·log n - n/(2r^2) < 0.

This concludes the proof that the average case complexity of Algorithm 𝒮 is

O((r/(1-β(r)))^2 + r^3).

The last part of the statement follows from the observation that

(r/(1-β(r)))^2 < 4r^6. ∎

Remark 5.6

The Perron–Frobenius theorem can be invoked to show that the spectral radius of M(G) is less than 2r-1 for each G ∈ 𝒢 that is not the clique on Ã. As we saw, we need however an estimate of how much smaller than 2r-1 these spectral radii are. The method used in the proof of Proposition 5.4 is far from optimal: we estimate the spectral radius of M(G) for the graphs obtained from the maximum element of 𝒢 by removing a single edge. Such graphs are far from being disconnected or having a cut vertex. Any upper bound on the spectral radius of the M(G) where G is disconnected or has a cut vertex would lead to an improvement in the expected running time of Algorithm 𝒮. There is considerable scope for such an improvement.

5.2 The Relative Primitivity Problem

We finally get to the Relative Primitivity Problem, RPrimP: on input a k-tuple w = (w_1, …, w_k) of words in F(A) and a word w_0 in F(A), along with their lengths, decide whether w_0 belongs to H = ⟨w⟩ and is primitive in it. As indicated earlier, the idea is essentially to combine an Algorithm MP_d, for a fast decision of the Uniform Membership Problem, with Shpilrain's Algorithm 𝒮, for a fast decision of the Primitivity Problem, carefully distinguishing between the situations where w has good properties (the d(n)-ctp for a well-chosen function d, and the fact that min|w| > (1/2)·max|w|), which will happen with high probability, and where it does not.

More precisely, consider the following algorithm, parametrized by the choice of a non-decreasing function d(n) such that d(n) < n/2.

Algorithm R P d

  1. Find out whether w has the d(n)-ctp and min|w| > n/2 (this is the first step of Algorithm MP_d). If one of these properties does not hold, go to step (2). If both do, compute Γ_{d(n)}(w) and go to step (3).

  2. Run Algorithm MP on input (w_0, w) to decide whether w_0 ∈ H = ⟨w⟩ and, if it does, to compute x_0, the expression of w_0 in a basis B of H. In the latter case, run Algorithm 𝒮 on x_0 in F(B), to decide whether x_0 is primitive in F(B), or equivalently, whether w_0 is primitive in H.

  3. Run steps (2) and (3), the latter iterated, of Algorithm MP_d to decide whether w_0 ∈ H and, if it does, to compute x_0, the expression of w_0 in the basis w. If w_0 ∈ H, run Algorithm 𝒮 on x_0 in the rank-k free group H = ⟨w⟩.
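The three steps above amount to a dispatch between known procedures. The following Python sketch records only this control flow; the callables has_ctp, run_MP, run_MPd and run_S are hypothetical placeholders for the subroutines named in the text (the ctp test, Algorithms MP and MP_d, and Shpilrain's Algorithm 𝒮) and are not implemented here.

```python
def relative_primitivity(w0, ws, d, has_ctp, run_MP, run_MPd, run_S):
    """Control-flow sketch of Algorithm RP_d.

    The subroutines are passed in as callables and stand for the
    procedures named in the text; they are NOT implemented here.
    Membership routines return (member, x0, rank): whether w0 lies in
    H = <ws>, its expression x0 in the chosen basis, and the basis rank.
    """
    n = max(len(w) for w in ws)
    # Step (1): test the d(n)-ctp and the length condition min|w| > n/2.
    if has_ctp(ws, d(n)) and min(len(w) for w in ws) > n / 2:
        # Step (3): fast membership via steps (2)-(3) of Algorithm MP_d,
        # expressing w0 in the basis ws itself (H has rank k).
        member, x0, rank = run_MPd(w0, ws)
    else:
        # Step (2): fall back to the classical membership Algorithm MP,
        # which produces a basis B of H with |B| <= k.
        member, x0, rank = run_MP(w0, ws)
    if not member:
        return False  # w0 is not in H, hence not primitive in H
    # In either case, decide primitivity of x0 with Algorithm S,
    # in a free group of the given rank.
    return run_S(x0, rank)
```

With trivial stub subroutines the function merely exercises the branching; the mathematical content lies entirely in the four subroutines.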

We prove the following theorem.

Theorem 5.7

Let d(n) be a non-decreasing function of n such that d(n) < n/2. Then Algorithm RP_d solves RPrimP.

Let r = |A| ≥ 2, 0 < δ < 1/4, 0 < β < 1/2 − 2δ and γ = (2r−1)^(2β/(1−4δ) − 1) < 1. Suppose that d(n) ≤ δn and k(n) ≤ (2r−1)^(βn) for every n. If we restrict the input space to pairs of the form (w_0, w), where max|w| = n and w is a tuple of length k(n), then the average case complexity of Algorithm RP_d is a big-𝒪 of

k(n) d(n) + k(n)^2 (2r−1)^(−d(n/2)) (k(n) n log*(k(n) n) + m + k(n)^6) + γ^m k(n)^6,

where m = |w_0|.

If the input (w_0, w) of RPrimP is limited to those pairs where w has the d(n)-ctp and min|w| > n/2, then the average case complexity of Algorithm RP_d is

O(k(n) d(n) + k(n)^6 γ^m).

Proof

It is clear that Algorithm RP_d solves RPrimP.

Step (1) of Algorithm RP_d takes time O(k(n) d(n)); see the analysis of Algorithm MP_d in the proof of Theorem 4.4.

The algorithm moves to step (2) with probability O(k(n)^2 (2r−1)^(−d(n/2))), and to step (3) with the complementary probability; see Propositions 3.4 and 3.5.

In case we reach step (2), as in the proof of Theorem 4.4, Algorithm MP takes time O(k(n) n log*(k(n) n) + r k(n) n + m) = O(k(n) n log*(k(n) n) + m). If Algorithm MP concludes that w_0 ∈ H, then it also outputs a word x_0 ∈ F(B), for a certain basis B of H; moreover, |B| ≤ k and |x_0| ≤ m. Running Algorithm 𝒮 then takes time O(k(n)^6) on average (see Proposition 5.5).

Otherwise, we reach step (3); we are in the situation where w has the d(n)-ctp and min|w| > n/2. In particular, the expected running time of step (2) and of all iterations of step (3) of Algorithm MP_d is O(k(n) d(n)); see the proof of Theorem 4.4 (this is where we use the hypotheses on k(n) and d(n)).

Moreover, the growth modulus γ_H of H is at most

(2k(n) − 1)^(2/(n − 4d(n)))

by Proposition 3.3. Then, for every n, we have

γ_H ≤ (2k(n) − 1)^(2/(n − 4d(n))) < (2k(n))^(2/(n − 4d(n))) ≤ (2 (2r−1)^(βn))^(2/((1−4δ)n)) = 2^(2/((1−4δ)n)) (2r−1)^(2β/(1−4δ)).

Taking the limit, we get

γ_H ≤ (2r−1)^(2β/(1−4δ)) < 2r−1

since, by hypothesis, 2β < 1 − 4δ. It follows that the probability that w_0 ∈ H is O((γ_H/(2r−1))^m), and hence O(γ^m). If indeed w_0 ∈ H, we run Algorithm 𝒮 on the word x_0 ∈ F(X), in expected time O(k(n)^6); see Proposition 5.5.
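The resulting inequality γ < 1 is easy to spot-check numerically. The sketch below, with illustrative values of r, δ and β satisfying the hypotheses of the theorem, is only a sanity check, not part of the proof.

```python
def gamma(r, beta, delta):
    # gamma = (2r-1)^(2*beta/(1-4*delta) - 1), as in Theorem 5.7
    return (2 * r - 1) ** (2 * beta / (1 - 4 * delta) - 1)

for r in (2, 3, 10):                  # rank of the ambient free group
    for delta in (0.05, 0.1, 0.2):    # 0 < delta < 1/4
        beta = (0.5 - 2 * delta) / 2  # strictly inside (0, 1/2 - 2*delta)
        assert 2 * beta < 1 - 4 * delta
        g = gamma(r, beta, delta)
        assert 0 < g < 1              # so gamma^m decays with m = |w0|
print("gamma < 1 in all tested cases")
```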

Thus the expected running time of this algorithm is bounded above by a big-𝒪 of

k(n) d(n) + k(n)^2 (2r−1)^(−d(n/2)) (k(n) n log*(k(n) n) + m + k(n)^6) + γ^m k(n)^6,

giving the stated asymptotic estimate.

Finally, if the input (w_0, w) is such that w has the d(n)-ctp and min|w| > n/2, then after step (1), we go directly to step (3), and the expected running time of the algorithm is O(k(n) d(n) + γ^m k(n)^6). ∎

As for the Uniform Membership Problem, this gives us upper bounds on the average case complexity of RPrimP for interesting functions k(n).

Corollary 5.8

The Relative Primitivity Problem (RPrimP) for F(A), with input a k(n)-tuple of words of length at most n and an additional word of length m, can be solved in expected time C(n, m) as follows (where r = |A| is taken to be constant).

  1. If k is constant, then C(n, m) = O(log n + m n^(−log(2r−1))).

  2. If θ > 0 and k(n) = n^θ, then

    C(n, m) = O(n^(θ+δ) + n^(2θ) (2r−1)^(−n^δ) m + n^(6θ) (2/(2r−1))^m)

    for any 0 < δ < 1.

  3. If 0 < β < 1/58 and k(n) = (2r−1)^(βn), then

    C(n, m) = O(n (2r−1)^(βn) + (2r−1)^(−5βn) m + (2r−1)^(6βn − ((1−58β)/(1−56β)) m)).

Proof

For the case where k is constant, apply Theorem 5.7 with d(n) = log(2n) and arbitrary valid values of δ and β. This shows (1).

For k(n) = n^θ, choose d(n) = (2n)^δ. Then the hypotheses of Theorem 5.7 are satisfied with δ = 1/8 and any β such that 0 < β < 1/4; the quantity γ is then γ = (2r−1)^(4β − 1). Choosing β such that (2r−1)^(4β) = 2 yields the stated result. This shows (2).
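The choice of β in case (2) is easily verified numerically: with δ = 1/8 the exponent 2β/(1−4δ) − 1 becomes 4β − 1, and β = log 2 / (4 log(2r−1)) indeed gives γ = 2/(2r−1). A quick sanity check (not part of the proof):

```python
import math

for r in (2, 3, 5, 20):
    # beta chosen so that (2r-1)^(4*beta) = 2
    beta = math.log(2) / (4 * math.log(2 * r - 1))
    assert 0 < beta < 0.25                # valid since 2 < 2r-1 for r >= 2
    g = (2 * r - 1) ** (4 * beta - 1)     # gamma with delta = 1/8
    assert math.isclose(g, 2 / (2 * r - 1))
print("gamma = 2/(2r-1) for the chosen beta")
```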

Finally, suppose that k(n) = (2r−1)^(βn) with 0 < β < 1/58. Let δ = 14β and d(n) = δn. Note that δ < 1/4 − β/2 (since β < 1/58). Then the hypotheses of Theorem 5.7 are satisfied with these values of β and δ. The asymptotic upper bound from that theorem now reads as a big-𝒪 of

n (2r−1)^(βn) + (2r−1)^((3β − δ/2) n) n log* n + (2r−1)^((2β − δ/2) n) m + (2r−1)^((8β − δ/2) n) + (2r−1)^(6βn) γ^m,

where γ = (2r−1)^(−(1 − 2β − 4δ)/(1 − 4δ)). Since δ = 14β, we have γ = (2r−1)^(−(1 − 58β)/(1 − 56β)), the second and fourth summands above are dominated by the first one, and the last summand becomes

(2r−1)^(6βn − ((1−58β)/(1−56β)) m).

So our upper bound is a big-𝒪 of

n (2r−1)^(βn) + (2r−1)^(−5βn) m + (2r−1)^(6βn − ((1−58β)/(1−56β)) m),

as stated. This shows (3), concluding the proof. ∎
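The arithmetic of the substitution δ = 14β can be verified exactly with rational arithmetic. The sketch below (a sanity check only) confirms the simplification of the exponent of γ and that the hypotheses of Theorem 5.7 hold for sample values of β below 1/58.

```python
from fractions import Fraction

for num in (1, 2, 5):
    beta = Fraction(num, 300)            # sample values with beta < 1/58
    delta = 14 * beta
    # exponent of gamma: 2*beta/(1-4*delta) - 1 = -(1-58*beta)/(1-56*beta)
    lhs = 2 * beta / (1 - 4 * delta) - 1
    rhs = -(1 - 58 * beta) / (1 - 56 * beta)
    assert lhs == rhs
    # hypotheses of Theorem 5.7: delta < 1/4 and beta < 1/2 - 2*delta
    assert delta < Fraction(1, 4)
    assert beta < Fraction(1, 2) - 2 * delta
print("exponent identity and hypotheses verified")
```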

Award Identifier / Grant number: PID2021-126851NB-100

Funding statement: The first and second named authors acknowledge partial support from the Spanish Agencia Estatal de Investigación, through grant PID2021-126851NB-100 (AEI/FEDER, UE).

A Appendix

Fact A.1

The growth modulus of L(G_{a,a}) is

(1/2) (2r − 3 + √((2r+1)^2 − 8)) = (2r−1) (1 − 1/(2r^2) − 3/(8r^4) + O(r^(−5))).

Proof

Let e_i (−r ≤ i ≤ r, i ≠ 0) be the column vectors with coordinate at vertex a_i equal to 1 and all other coordinates equal to 0, that is, the standard basis of the 2r-dimensional vector space. Let M_{a,a} be the transition matrix of A(G_{a,a}); see Figure 1.

It is an elementary verification that e_1 − e_{−1} is in the kernel of M_{a,a}, that each e_i − e_{−i} (2 ≤ i ≤ r) is an eigenvector for the eigenvalue 1, and that each e_i + e_{−i} − e_r − e_{−r} (2 ≤ i < r) is an eigenvector for the eigenvalue −1. These 2r − 2 vectors, together with v_1 = e_1 + e_{−1} and v_2 = Σ_{i≥2} (e_i + e_{−i}), form a basis of the full space.

Moreover, M_{a,a} v_1 = 2 v_2 and M_{a,a} v_2 = (2r−2) v_1 + (2r−3) v_2. Therefore, the other eigenvalues of M_{a,a} are the eigenvalues of the 2×2 matrix [[0, 2r−2], [2, 2r−3]] (rows listed), and the result follows. ∎
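Fact A.1 can be double-checked numerically from the 2×2 block alone. The sketch below (a sanity check under the identifications made in the proof, not a replacement for it) confirms the closed form and the accuracy of the asymptotic expansion.

```python
import math

# The eigenvalues of [[0, 2r-2], [2, 2r-3]] are the roots of
# X^2 - (2r-3)X - 2(2r-2); the discriminant simplifies to (2r+1)^2 - 8,
# and the expansion (2r-1)(1 - 1/(2r^2) - 3/(8r^4)) approximates the
# largest root with error O(r^-4).
def leading_eigenvalue(r):
    assert (2 * r - 3) ** 2 + 8 * (2 * r - 2) == (2 * r + 1) ** 2 - 8
    return ((2 * r - 3) + math.sqrt((2 * r + 1) ** 2 - 8)) / 2

for r in (2, 5, 10, 100):
    lam = leading_eigenvalue(r)
    approx = (2 * r - 1) * (1 - 1 / (2 * r ** 2) - 3 / (8 * r ** 4))
    assert lam < 2 * r - 1                 # strictly below the trivial bound
    assert abs(lam - approx) < 2 / r ** 4  # expansion accurate to O(r^-4)
print("Fact A.1 expansion verified numerically")
```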

Fact A.2

The growth modulus of L(G_{a,b}) is the maximum root of

X^3 − (2r−1) X^2 + 4(r−1),

and it is bounded above by (2r−1)(1 − 1/(2r^2)).

Proof

Let e_i (−r ≤ i ≤ r, i ≠ 0) be as in the proof of Fact A.1 and let M_{a,b} be the transition matrix of A(G_{a,b}); see Figure 1.

One verifies easily that e_{−1} − e_{−2} is in the kernel of M_{a,b} and that

M_{a,b}(e_1 − e_2) = e_{−2} − e_{−1}.

It also holds that each e_i − e_{−i} (3 ≤ i ≤ r) is an eigenvector for eigenvalue 1, and each e_i + e_{−i} − e_r − e_{−r} (3 ≤ i < r) is an eigenvector for eigenvalue −1.

These 2r − 3 vectors are linearly independent, and the vectors v_1 = e_1 + e_2, v_2 = e_{−1} + e_{−2} and v_3 = Σ_{i=3}^{r} (e_i + e_{−i}) complete them to a basis of the full space.

In addition, we have M_{a,b} v_1 = 2(v_1 + v_3), M_{a,b} v_2 = v_1 + 2 v_2 + 2 v_3 and M_{a,b} v_3 = (2r−4) v_1 + (2r−4) v_2 + (2r−5) v_3.

Thus, in this basis (suitably ordered), the matrix of the linear transformation M_{a,b} consists of the following diagonal blocks: the 2×2 block [[0, 1], [0, 0]], the 1×1 block (1) repeated r−2 times, the 1×1 block (−1) repeated r−3 times, and the 3×3 block [[2, 1, 2r−4], [0, 2, 2r−4], [2, 2, 2r−5]] (rows listed).

In particular, the remaining eigenvalues are the roots of the characteristic polynomial of that 3×3 matrix, namely P(X) = X^3 − (2r−1) X^2 + 4(r−1). We note that the local extrema of P(X) are at 0 and (2/3)(2r−1), that P(0) = 4(r−1) > 0 and that P(r) = −r^3 + r^2 + 4r − 4 = −(r−1)(r−2)(r+2), which is non-positive for all r ≥ 2. Therefore, P(X) has three real roots.

Since P(2r−1) = 4(r−1) > 0, the leading eigenvalue of M_{a,b} sits between (2/3)(2r−1) and 2r−1. For a closer estimate, let δ = (2r−1)(1 − 1/(2r^2)). Then

P(δ) = 4(r−1) + (2r−1)^3 (1 − 1/(2r^2))^2 (1 − 1/(2r^2) − 1) = 4(r−1) − ((2r^2−1)^2/(8r^6)) (2r−1)^3 = (16r^6 + 8r^5 − 44r^4 + 16r^3 + 8r^2 − 6r + 1)/(8r^6),

which is positive when r ≥ 2. Thus the leading eigenvalue of M_{a,b}, which is the growth modulus of L(G_{a,b}), is at most δ = (2r−1)(1 − 1/(2r^2)), as stated. ∎

Remark A.3

Facts A.1 and A.2, while mathematically elementary, would have been very difficult to establish without the help of a versatile computer algebra system. The authors are grateful to the developers of SageMath [24].

Acknowledgements

The authors gratefully acknowledge the referee’s thorough reading and insightful remarks, which helped correct inaccuracies and make the paper more readable.

Communicated by: Rachel Skipper

References

[1] G. N. Arzhantseva and A. Y. Ol'shanskiĭ, Generality of the class of groups in which subgroups with a lesser number of generators are free, Mat. Zametki 59 (1996), no. 4, 489–496, 638. doi:10.1007/BF02308683

[2] F. Bassino, C. Nicaud and P. Weil, Generic properties of subgroups of free groups and finite presentations, Algebra and Computer Science, Contemp. Math. 677, American Mathematical Society, Providence (2016), 1–43. doi:10.1090/conm/677/13619

[3] F. Bassino, C. Nicaud and P. Weil, Random presentations and random subgroups: A survey, Complexity and Randomness in Group Theory – GAGTA Book 1, De Gruyter, Berlin (2020), 45–76. doi:10.1515/9783110667028-002

[4] J. L. Bentley and A. C. C. Yao, An almost optimal algorithm for unbounded searching, Inform. Process. Lett. 5 (1976), no. 3, 82–87. doi:10.1016/0020-0190(76)90071-5

[5] J. Burillo and E. Ventura, Counting primitive elements in free groups, Geom. Dedicata 93 (2002), 143–162. doi:10.1023/A:1020391000992

[6] N. Chomsky and M. P. Schützenberger, The algebraic theory of context-free languages, Computer Programming and Formal Systems, North-Holland, Amsterdam (1963), 118–161. doi:10.1016/S0049-237X(08)72023-8

[7] M. Dehn, Über unendliche diskontinuierliche Gruppen, Math. Ann. 71 (1911), no. 1, 116–144. doi:10.1007/BF01456932

[8] J. Delgado and E. Ventura, A list of applications of Stallings automata, Trans. Comb. 11 (2022), no. 3, 181–235.

[9] P. Flajolet and R. Sedgewick, Analytic Combinatorics, Cambridge University Press, Cambridge, 2009. doi:10.1017/CBO9780511801655

[10] J. E. Hopcroft and R. E. Tarjan, Efficient algorithms for graph manipulation [H] (Algorithm 447), Commun. ACM 16 (1973), no. 6, 372–378. doi:10.1145/362248.362272

[11] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1990.

[12] I. Kapovich and A. Myasnikov, Stallings foldings and subgroups of free groups, J. Algebra 248 (2002), no. 2, 608–668. doi:10.1006/jabr.2001.9033

[13] S. Margolis, M. Sapir and P. Weil, Closed subgroups in pro-V topologies and the extension problem for inverse automata, Internat. J. Algebra Comput. 11 (2001), no. 4, 405–445. doi:10.1142/S0218196701000498

[14] A. Miasnikov, E. Ventura and P. Weil, Algebraic extensions in free groups, Geometric Group Theory, Trends Math., Birkhäuser, Basel (2007), 225–253. doi:10.1007/978-3-7643-8412-8_12

[15] J. Nielsen, Die Isomorphismen der allgemeinen, unendlichen Gruppe mit zwei Erzeugenden, Math. Ann. 78 (1917), no. 1, 385–397. doi:10.1007/BF01457113

[16] P. Novikov, On the algorithmic unsolvability of the word problem in group theory, Trudy Mat. Inst. Steklov. 44 (1955), 1–43.

[17] A. Roig, E. Ventura and P. Weil, On the complexity of the Whitehead minimization problem, Internat. J. Algebra Comput. 17 (2007), no. 8, 1611–1634. doi:10.1142/S0218196707004244

[18] J.-P. Serre, Trees, Springer, Berlin, 1980. doi:10.1007/978-3-642-61856-7

[19] V. Shpilrain, Average-case complexity of the Whitehead problem for free groups, Comm. Algebra 51 (2023), no. 2, 799–806. doi:10.1080/00927872.2022.2113791

[20] J. R. Stallings, Topology of finite graphs, Invent. Math. 71 (1983), no. 3, 551–565. doi:10.1007/BF02095993

[21] N. W. M. Touikan, A fast algorithm for Stallings' folding process, Internat. J. Algebra Comput. 16 (2006), no. 6, 1031–1045. doi:10.1142/S0218196706003396

[22] J. H. C. Whitehead, On certain sets of elements in a free group, Proc. London Math. Soc. (2) 41 (1936), no. 1, 48–56. doi:10.1112/plms/s2-41.1.48

[23] J. H. C. Whitehead, On equivalent sets of elements in a free group, Ann. of Math. (2) 37 (1936), no. 4, 782–800. doi:10.2307/1968618

[24] The Sage Developers, SageMath, the Sage Mathematics Software System (Version 9.6), 2022, https://www.sagemath.org.

Received: 2023-03-26
Revised: 2023-11-25
Published Online: 2024-02-14
Published in Print: 2024-09-01

© 2024 Walter de Gruyter GmbH, Berlin/Boston
