
Fibers of automorphic word maps and an application to composition factors

  • Alexander Bors
Published/Copyright: July 20, 2017

Abstract

In this paper, we study the fibers of “automorphic word maps”, a certain generalization of word maps, on finite groups and on nonabelian finite simple groups in particular. As an application, we derive a structural restriction on finite groups $G$ where, for some fixed nonempty reduced word $w$ in $d$ variables and some fixed $\rho\in(0,1]$, the word map $w_G$ on $G$ has a fiber of size at least $\rho|G|^d$: no sufficiently large alternating group and no (classical) simple group of Lie type of sufficiently high rank can occur as a composition factor of such a group $G$.

1 Introduction

1.1 Motivation and main result

Word maps on groups have been studied intensely in recent years, resulting in substantial progress on interesting questions and a beautiful theory using tools from various areas such as representation theory and algebraic geometry; interested readers are referred to the survey article [5].

Recall that a (reduced) word $w$ in $d$ variables $X_1,\ldots,X_d$ is an element of the free group $F(X_1,\ldots,X_d)$. For each group $G$, each such word gives rise to a word map $w_G\colon G^d\to G$ induced by substitution. Studying the fibers of $w_G$ means studying the solution sets in $G^d$ to equations of the form $w=w(X_1,\ldots,X_d)=g$ for $g\in G$. By Larsen and Shalev’s result [3, Theorem 1.1], for fixed $w$, the maximum number of solutions to such an equation in a nonabelian finite simple group $S$ is in $o(|S|^d)$ as $|S|\to\infty$. Hence for each fixed number $\rho\in(0,1]$, there are only finitely many nonabelian finite simple groups $S$ such that $w_S$ has a fiber of size at least $\rho|S|^d$.

Based on this, it is natural to ask what one can say more generally about the nonabelian composition factors of a finite group $G$ where the word map $w_G$ has a fiber of size at least $\rho|G|^d$. In order to be able to use [3, Theorem 1.1] for this, it would be useful if one could somehow relate the maximum fiber size of $w_G$ with the maximum fiber sizes of the word maps associated with $w$ over the composition factors of $G$. For example, it would be nice to have an inequality of the form $\Pi_w(G)\le\Pi_w(N)\cdot\Pi_w(G/N)$ for all finite groups $G$ and all normal subgroups $N$ of $G$, where $\Pi_w$ denotes the function that maps each finite group $G$ to the maximum fiber size of $w_G$. Unfortunately, this is not the case, even if we assume that $N$ is characteristic in $G$; consider, for example, $G=D_{2o}$, the dihedral group of order $2o$, for some odd integer $o\ge3$, $N$ the unique cyclic subgroup of index $2$ in $G$, and $w=X_1^2$.
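The dihedral counterexample can be checked by brute force. The following is a minimal sketch, not taken from the paper: the encoding of $D_{2o}$ as pairs (rotation exponent, reflection flag) and all function names are assumptions made for illustration.

```python
from collections import Counter

def dihedral(o):
    """Elements (r, s) of D_{2o}: rotation exponent r mod o, reflection flag s."""
    return [(r, s) for r in range(o) for s in (0, 1)]

def mul(x, y, o):
    """Multiplication in D_{2o}; a reflection inverts the rotation to its right."""
    (r1, s1), (r2, s2) = x, y
    return ((r1 + (-1) ** s1 * r2) % o, s1 ^ s2)

def max_square_fiber(elements, multiply):
    """Maximum fiber size of the word map of w = X_1^2 on the given group."""
    counts = Counter(multiply(g, g) for g in elements)
    return max(counts.values())

def compare(o):
    """Return (Pi_w(G), Pi_w(N), Pi_w(G/N)) for G = D_{2o}, N the rotations."""
    G = dihedral(o)
    N = [(r, 0) for r in range(o)]
    Q = [0, 1]  # G/N is cyclic of order 2, written additively
    pi_G = max_square_fiber(G, lambda x, y: mul(x, y, o))
    pi_N = max_square_fiber(N, lambda x, y: mul(x, y, o))
    pi_Q = max_square_fiber(Q, lambda a, b: (a + b) % 2)
    return pi_G, pi_N, pi_Q

print(compare(5))  # (6, 1, 2)
```

For odd $o$, all $o$ reflections and the identity square to the identity, so the largest fiber over $G$ has $o+1$ elements, whereas squaring is a bijection on $N$ and constant on $G/N$; the product of the two smaller values is $2$.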

In this paper, we will describe a way to circumvent these difficulties and provide some strong restrictions on possible composition factors of a finite group $G$ such that $\Pi_w(G)\ge\rho|G|^d$ in the form of Theorem 1.1.2 below. First, we introduce some constants:

Notation 1.1.1.

Let $w$ be a reduced word of length $l\ge1$ in $d$ distinct variables. We introduce the following constant, depending only on $w$:

$$M=M(d,l):=\sum_{k=0}^{2l+2}(2l(d+1))^k=\frac{(2l(d+1))^{2l+3}-1}{2l(d+1)-1}.$$

Furthermore, we set M:=M(l,l).
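To get a feeling for the size of this constant, the geometric sum and its closed form can be evaluated exactly with integer arithmetic. This is only an illustrative sketch; the function names `M_sum` and `M_closed` are hypothetical.

```python
def M_sum(d, l):
    """M(d, l) as the sum of (2l(d+1))^k for k = 0, ..., 2l+2."""
    base = 2 * l * (d + 1)
    return sum(base ** k for k in range(2 * l + 3))

def M_closed(d, l):
    """M(d, l) via the closed form of the geometric sum (exact integer division)."""
    base = 2 * l * (d + 1)
    return (base ** (2 * l + 3) - 1) // (base - 1)

print(M_sum(1, 1))     # 341
print(M_closed(2, 2))  # 3257437
```

Already for a word of length $2$ in $2$ variables the constant exceeds three million, which indicates how large the explicit thresholds in the results below are.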

Our main result is the following (as usual, the “untwisted Lie rank” of a Lie-type group is the Lie rank of the corresponding untwisted group):

Theorem 1.1.2.

Let $w$ be a reduced word of length $l\ge1$ in $d$ distinct variables. Then for all $\rho\in(0,1]$ and all finite groups $G$ such that the word map $w_G$ has a fiber of size at least $\rho|G|^d$, the following hold:

  1. No alternating group of order larger than

    $$\max\{\lceil16l^{16}\mathrm{e}^{16Ml-2}\rceil!,\ \rho^{-16M}\}$$

    is a composition factor of $G$.

  2. No (classical) simple group of Lie type of untwisted Lie rank larger than

    $$\max\{72(l+1)^2l^2,\ 72(l+1)^2l^2\log_2(\rho^{-1})\}$$

    is a composition factor of $G$.

In other words, the list of potential composition factors for such a group G consists of cyclic groups of prime order, finitely many alternating groups, the sporadic groups, and all simple Lie-type groups of bounded rank.

1.2 Main ideas and overview of the paper

The main idea for proving Theorem 1.1.2 is to make up for the above-mentioned “flaw” of the function $\Pi_w$ by replacing it by a pointwise larger function $\mathfrak{P}_w$ which satisfies the inequality $\mathfrak{P}_w(G)\le\mathfrak{P}_w(N)\cdot\mathfrak{P}_w(G/N)$ at least when $N$ is characteristic in $G$, and to study $\mathfrak{P}_w$ instead. To this end, we generalize the notion of a word map in a certain way.

Let us first fix some notation. For a fixed reduced word $w$ in the $d$ variables $X_1,\ldots,X_d$ and of length $l$, write

$$w=x_1^{\epsilon_1}\cdots x_l^{\epsilon_l},$$

where $x_1,\ldots,x_l\in\{X_1,\ldots,X_d\}$ and $\epsilon_i=\pm1$. Denote by $\iota$ the unique function $\{1,\ldots,l\}\to\{1,\ldots,d\}$ such that for $i=1,\ldots,l$, $x_i=X_{\iota(i)}$. Thus for each group $G$, the word map $w_G$ is just the map $G^d\to G$ sending

$$(g_1,\ldots,g_d)\mapsto g_{\iota(1)}^{\epsilon_1}\cdots g_{\iota(l)}^{\epsilon_l}.$$

Definition 1.2.1.

We introduce the following terminology and notation:

  1. With notation as above, let $G$ be a group, and let $\alpha_1,\ldots,\alpha_l$ be automorphisms of $G$. The automorphic word map $w_G^{(\alpha_1,\ldots,\alpha_l)}$ is the map $G^d\to G$ sending $(g_1,\ldots,g_d)\mapsto\alpha_1(g_{\iota(1)})^{\epsilon_1}\cdots\alpha_l(g_{\iota(l)})^{\epsilon_l}$.

  2. By $\mathfrak{P}_w$, we denote the function that maps each finite group $G$ to the maximum size of a fiber of one of the automorphic word maps $w_G^{(\alpha_1,\ldots,\alpha_l)}$, with $\alpha_1,\ldots,\alpha_l$ automorphisms of $G$.

Hence $w_G^{(\alpha_1,\ldots,\alpha_l)}$ is like $w_G$, except that in the $i$-th factor of the $l$-factor product defining the evaluation $w_G(g_1,\ldots,g_d)$, we additionally apply $\alpha_i$, one of $l$ automorphisms fixed beforehand. In particular, $w_G^{(\mathrm{id},\ldots,\mathrm{id})}=w_G$.

The approach of studying fibers of automorphic word maps will actually allow us to prove the following stronger form of Theorem 1.1.2:

Theorem 1.2.2.

Let $w$ be a reduced word of length $l\ge1$ in $d$ distinct variables and $M=M(w)$ as in Notation 1.1.1. Then for all $\rho\in(0,1]$ and all finite groups $G$ with $\mathfrak{P}_w(G)\ge\rho|G|^d$, the following hold:

  1. No alternating group of order larger than

    $$\max\{\lceil16l^{16}\mathrm{e}^{16Ml-2}\rceil!,\ \rho^{-16M}\}$$

    is a composition factor of $G$.

  2. No (classical) simple group of Lie type of untwisted Lie rank larger than

    $$\max\{72(l+1)^2l^2,\ 72(l+1)^2l^2\log_2(\rho^{-1})\}$$

    is a composition factor of $G$.

We now give an overview of the rest of this paper:

  1. In Section 2, we prove our main lemma, Lemma 2.1, which includes the inequality $\mathfrak{P}_w(G)\le\mathfrak{P}_w(N)\cdot\mathfrak{P}_w(G/N)$ for characteristic subgroups $N$ of $G$. It also includes the observation that the element of $G$ having the largest fiber size under any automorphic word map on $G$ is the identity element of $G$.

  2. Having gained a basic understanding of automorphic word maps in Section 2, the next goal is to extend, as far as necessary, Larsen and Shalev’s result [3, Theorem 1.1] on fibers of word maps on nonabelian finite simple groups mentioned above to fibers of automorphic word maps. This will be done in Section 3, see Theorem 3.1.2.

  3. Section 4 consists of the proof of Theorem 1.2.2 based on the results developed so far.

  4. Finally, in Section 5, we give some concluding remarks concerning further extensions of Larsen and Shalev’s techniques to automorphic word maps and an interesting consequence thereof.

1.3 Notation

We denote by $\mathbb{N}$ the set of natural numbers (including $0$) and by $\mathbb{N}^+$ the set of positive integers. Euler’s constant is denoted by $\mathrm{e}$, which is to be distinguished from the variable $e$. The image and preimage of a set $M$ under a function $f$ are denoted by $f[M]$ and $f^{-1}[M]$, respectively. When $f_i\colon X_i\to Y_i$ for $i=1,\ldots,n$, then we denote by $f_1\times\cdots\times f_n$ the product of the maps $f_i$, i.e., the map

$$\prod_{i=1}^nX_i\to\prod_{i=1}^nY_i,\quad(x_1,\ldots,x_n)\mapsto(f_1(x_1),\ldots,f_n(x_n)).$$

The $n$-fold product of a map $f$ with itself is denoted by $f^{(n)}$. These last two notations will be used in the proofs of Lemma 4.4 and of the implication “Conjecture 5.2 $\Rightarrow$ Conjecture 5.3” in Section 5.

For a group $G$ and an element $g\in G$, we denote by $\mathrm{conj}(g)$ the conjugation by $g$ on $G$, i.e., the inner automorphism of $G$ of the form $x\mapsto gxg^{-1}$. The automorphism group of $G$ is denoted by $\mathrm{Aut}(G)$, and the inner automorphism group of $G$ by $\mathrm{Inn}(G)$. For a finite set $X$, we denote by $\mathcal{S}_X$ the symmetric group on $X$; for a positive integer $n$, $\mathcal{S}_n$ and $\mathcal{A}_n$ denote the symmetric and alternating group on $\{1,\ldots,n\}$, respectively.

For a prime power $q$, the finite field with $q$ elements is denoted by $\mathbb{F}_q$. For $n\in\mathbb{N}^+$ and a prime power $q$, $\mathrm{Mat}_n(q)$ denotes the ring of $(n\times n)$-matrices over $\mathbb{F}_q$. For a vector space $\Delta$ over some field $F$, we denote by $\mathrm{End}_F(\Delta)$ the endomorphism ring of $\Delta$, i.e., the ring of $F$-linear maps $\Delta\to\Delta$.

At some points in our arguments, we will not consider all possible automorphic word maps $w_G^{(\alpha_1,\ldots,\alpha_l)}$ over some finite group $G$, but only those where the $\alpha_i$ are from a certain subset of $\mathrm{Aut}(G)$. Also, we sometimes want to talk about the maximum fiber size of a particular element of $G$ under an automorphic word map or about the proportion of a fiber of an (automorphic) word map associated with $w$ within the entire argument set $G^d$, rather than the actual size of the fiber. We therefore introduce the following notation that supplements the notation already introduced:

Notation 1.3.1.

Let $G$ be a finite group, $w$ a reduced word of length $l$ in $d$ distinct variables, $A\subseteq\mathrm{Aut}(G)$.

  1. We set $\pi_w(G):=\Pi_w(G)/|G|^d$ and $\mathfrak{p}_w(G):=\mathfrak{P}_w(G)/|G|^d$. Note that we always have $\pi_w(G),\mathfrak{p}_w(G)\in(0,1]$.

  2. We denote by $\mathfrak{P}_w^{(A)}(G,g)$ the maximum size of the fiber of $g$ under an automorphic word map of the form $w_G^{(\alpha_1,\ldots,\alpha_l)}$, where $\alpha_i\in A$ for $i=1,\ldots,l$, and we set

    $$\mathfrak{P}_w^{(A)}(G):=\max_{g\in G}\mathfrak{P}_w^{(A)}(G,g)$$

    (so that $\mathfrak{P}_w^{(\mathrm{Aut}(G))}(G)=\mathfrak{P}_w(G)$).

  3. Moreover, we set

    $$\mathfrak{p}_w^{(A)}(G,g):=\frac{\mathfrak{P}_w^{(A)}(G,g)}{|G|^d}$$

    and

    $$\mathfrak{p}_w^{(A)}(G):=\frac{\mathfrak{P}_w^{(A)}(G)}{|G|^d}.$$

2 Basic bounds for automorphic word maps

In this section, we prove the following lemma containing some basic bounds on fiber sizes of automorphic word maps:

Lemma 2.1.

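Let $w$ be a reduced word, $G$ a finite group, $A$ a subgroup of $\mathrm{Aut}(G)$ containing $\mathrm{Inn}(G)$. Furthermore, let $N$ be a characteristic subgroup of $G$, and denote by

  1. $\mathrm{ind}(A)$ the subgroup of $\mathrm{Aut}(G/N)$ consisting of all automorphisms of $G/N$ induced by some automorphism from $A$,

  2. $\mathrm{res}(A)$ the subgroup of $\mathrm{Aut}(N)$ consisting of all restrictions of automorphisms from $A$ to $N$,

  3. $\pi\colon G\to G/N$ the canonical projection.

Then the following hold:

  1. For all $g\in G$,

    $$\mathfrak{P}_w^{(A)}(G,g)\le\mathfrak{P}_w^{(\mathrm{ind}(A))}(G/N,\pi(g))\cdot\mathfrak{P}_w^{(\mathrm{res}(A))}(N,1),$$

    or in terms of proportions,

    $$\mathfrak{p}_w^{(A)}(G,g)\le\mathfrak{p}_w^{(\mathrm{ind}(A))}(G/N,\pi(g))\cdot\mathfrak{p}_w^{(\mathrm{res}(A))}(N,1).$$

  2. For all $g\in G$,

    $$\mathfrak{P}_w^{(A)}(G,g)\le\mathfrak{P}_w^{(A)}(G,1),$$

    or in terms of proportions,

    $$\mathfrak{p}_w^{(A)}(G,g)\le\mathfrak{p}_w^{(A)}(G,1).$$

    Hence $\mathfrak{P}_w^{(A)}(G,1)=\mathfrak{P}_w^{(A)}(G)$.

  3. We have

    $$\mathfrak{P}_w^{(A)}(G)\le\mathfrak{P}_w^{(\mathrm{ind}(A))}(G/N)\cdot\mathfrak{P}_w^{(\mathrm{res}(A))}(N),$$

    or in terms of proportions,

    $$\mathfrak{p}_w^{(A)}(G)\le\mathfrak{p}_w^{(\mathrm{ind}(A))}(G/N)\cdot\mathfrak{p}_w^{(\mathrm{res}(A))}(N).$$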

Proof.

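(1) As before, we write $w=x_1^{\epsilon_1}\cdots x_l^{\epsilon_l}$ with $x_1,\ldots,x_l\in\{X_1,\ldots,X_d\}$, $\epsilon_i\in\{\pm1\}$ and $\iota\colon\{1,\ldots,l\}\to\{1,\ldots,d\}$ such that $x_i=X_{\iota(i)}$. Furthermore, fix an $l$-tuple $(\alpha_1,\ldots,\alpha_l)$ of elements of $A$ such that the size of the fiber $\Phi$ of $g$ under the map $w_G^{(\alpha_1,\ldots,\alpha_l)}$ equals $\mathfrak{P}_w^{(A)}(G,g)$. For $i=1,\ldots,l$, denote by $\tilde{\alpha_i}$ the automorphism of $G/N$ induced by $\alpha_i$.

We will establish the inequality by a coset-wise counting argument. More precisely, we will show the following two assertions, which together imply the inequality:

  1. The number of cosets of $N^d$ in $G^d$ having nonempty intersection with $\Phi$ is at most $\mathfrak{P}_w^{(\mathrm{ind}(A))}(G/N,\pi(g))$.

  2. $\Phi$ intersects each coset of $N^d$ in $G^d$ in at most $\mathfrak{P}_w^{(\mathrm{res}(A))}(N,1)$ many elements.

For the first assertion, let $(g_1,\ldots,g_d)\in\Phi$. In other words,

$$\alpha_1(g_{\iota(1)})^{\epsilon_1}\cdots\alpha_l(g_{\iota(l)})^{\epsilon_l}=g.\tag{2.1}$$

Applying $\pi$ to both sides of formula (2.1) yields

$$\tilde{\alpha_1}(\pi(g_{\iota(1)}))^{\epsilon_1}\cdots\tilde{\alpha_l}(\pi(g_{\iota(l)}))^{\epsilon_l}=\pi(g),$$

and thus that $(\pi(g_1),\ldots,\pi(g_d))$ lies in the fiber of $\pi(g)$ under $w_{G/N}^{(\tilde{\alpha_1},\ldots,\tilde{\alpha_l})}$. The assertion follows immediately from this.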

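For the second assertion, fix a coset $C$ of $N^d$ in $G^d$, say $C=N^d(g_1,\ldots,g_d)$. We want to show that $|C\cap\Phi|\le\mathfrak{P}_w^{(\mathrm{res}(A))}(N,1)$. Of course, we may assume that $C\cap\Phi$ is nonempty, and without loss of generality even that the coset representative $(g_1,\ldots,g_d)$ which we fixed lies in $\Phi$. Hence formula (2.1) holds. We now characterize those $(n_1,\ldots,n_d)\in N^d$ such that $w_G^{(\alpha_1,\ldots,\alpha_l)}(n_1g_1,\ldots,n_dg_d)=g$ as well, i.e., such that

$$g=\alpha_1(n_{\iota(1)}g_{\iota(1)})^{\epsilon_1}\cdots\alpha_l(n_{\iota(l)}g_{\iota(l)})^{\epsilon_l}=t_1\cdots t_l,\tag{2.2}$$

where

$$t_i=\begin{cases}\alpha_i(n_{\iota(i)})^{\epsilon_i}\alpha_i(g_{\iota(i)})^{\epsilon_i},&\text{if }\epsilon_i=+1,\\\alpha_i(g_{\iota(i)})^{\epsilon_i}\alpha_i(n_{\iota(i)})^{\epsilon_i},&\text{if }\epsilon_i=-1.\end{cases}$$

Note that under the assumed formula (2.1), formula (2.2) is equivalent to the following:

$$1=gg^{-1}=t_1t_2\cdots t_l\cdot\alpha_l(g_{\iota(l)})^{-\epsilon_l}\cdots\alpha_2(g_{\iota(2)})^{-\epsilon_2}\alpha_1(g_{\iota(1)})^{-\epsilon_1}.\tag{2.3}$$

We now transform the product expression on the right-hand side of formula (2.3) without changing its value by removing, step by step for $i=l,l-1,\ldots,1$, the factors $\alpha_i(g_{\iota(i)})^{\epsilon_i}$ and $\alpha_i(g_{\iota(i)})^{-\epsilon_i}$ (each of which occurs precisely once in the expression) and applying $\mathrm{conj}(\alpha_i(g_{\iota(i)})^{\epsilon_i})$ to each of the factors between them. This results in an expression of the form $w_N^{(\beta_1,\ldots,\beta_l)}(n_1,\ldots,n_d)$, where each $\beta_i$ is the restriction to $N$ of an element of $A$, namely of the composition of some inner automorphism of $G$ with $\alpha_i$. Hence the tuples $(n_1,\ldots,n_d)$ in question lie in the fiber of $1$ under $w_N^{(\beta_1,\ldots,\beta_l)}$, whose size is at most $\mathfrak{P}_w^{(\mathrm{res}(A))}(N,1)$.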

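(2) This follows by setting $N:=G$ in point (1) of this lemma.

(3) By points (1) and (2), we have

$$\mathfrak{P}_w^{(A)}(G)=\mathfrak{P}_w^{(A)}(G,1)\le\mathfrak{P}_w^{(\mathrm{ind}(A))}(G/N,\pi(1))\cdot\mathfrak{P}_w^{(\mathrm{res}(A))}(N,1)=\mathfrak{P}_w^{(\mathrm{ind}(A))}(G/N)\cdot\mathfrak{P}_w^{(\mathrm{res}(A))}(N),$$

using the fact that $\mathrm{ind}(A)$ resp. $\mathrm{res}(A)$ contains $\mathrm{Inn}(G/N)$ resp. $\mathrm{Inn}(N)$. ∎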
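Lemma 2.1 can be tested by exhaustive search on a very small example. The sketch below is an illustration under assumptions that are not from the paper: it takes $G=\mathcal{S}_3$ (permutations as tuples on $\{0,1,2\}$), $A=\mathrm{Inn}(G)=\mathrm{Aut}(G)$, $N=\mathcal{A}_3$ and $w=X_1^2$, and all helper names are hypothetical.

```python
from collections import Counter
from itertools import permutations, product

def compose(p, q):
    """Product p*q of permutations given as tuples: x -> p[q[x]]."""
    return tuple(p[q[x]] for x in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)

def max_fiber(elements, autos, word, mul, inv, target=None):
    """Largest fiber over all automorphic word maps with automorphisms from autos.
    word: list of (variable index, exponent); autos: dicts element -> element.
    If target is given, only the fiber of that element is measured."""
    d = 1 + max(var for var, _ in word)
    best = 0
    for alphas in product(autos, repeat=len(word)):
        counts = Counter()
        for args in product(elements, repeat=d):
            value = None
            for (var, eps), alpha in zip(word, alphas):
                factor = alpha[args[var]]
                if eps == -1:
                    factor = inv(factor)
                value = factor if value is None else mul(value, factor)
            counts[value] += 1
        size = max(counts.values()) if target is None else counts[target]
        best = max(best, size)
    return best

def is_even(p):
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return inversions % 2 == 0

WORD = [(0, 1), (0, 1)]            # w = X_1^2
G = list(permutations(range(3)))   # S_3
E = (0, 1, 2)                      # identity of G
INN = [{g: compose(compose(s, g), inverse(s)) for g in G} for s in G]
N = [g for g in G if is_even(g)]   # A_3, characteristic in S_3
RES = [dict(items) for items in {frozenset((n, a[n]) for n in N) for a in INN}]
Q = [0, 1]                         # G/N, cyclic of order 2 (additive)
IND = [{0: 0, 1: 1}]               # only the trivial automorphism is induced

P_G = max_fiber(G, INN, WORD, compose, inverse)
P_G_AT_1 = max_fiber(G, INN, WORD, compose, inverse, target=E)
P_N = max_fiber(N, RES, WORD, compose, inverse)
P_Q = max_fiber(Q, IND, WORD, lambda a, b: (a + b) % 2, lambda a: a)
print(P_G, P_G_AT_1, P_N, P_Q)     # 4 4 3 2
```

In this example the identity attains the maximum fiber size, in line with point (2), and $4\le2\cdot3$, in line with point (3). The factor $3$ for $N$ comes from the automorphic word map that combines the identity with inversion, which is the restriction to $N$ of conjugation by a transposition.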

3 On fibers of automorphic word maps on nonabelian finite simple groups

3.1 Larsen and Shalev’s result and the main result of this section

The following theorem is an equivalent reformulation of [3, Theorem 1.1]:

Theorem 3.1.1.

For each nonempty and reduced word $w$ in $d$ distinct variables, there exist constants $N(w),\eta(w)>0$ such that for all nonabelian finite simple groups $S$ with $|S|\ge N(w)$, the inequality $\Pi_w(S)\le|S|^{d-\eta(w)}$ holds.

The proof of this theorem in [3] is split into three parts (note that the sporadic groups can be ignored here, as the fiber size bound only needs to be shown for large enough S):

  1. First, the bound is established for large enough alternating groups by means of a certain combinatorial construction.

  2. Next, the bound is established for all simple Lie-type groups of sufficiently high rank, where the lower bound on the rank is so large that only classical groups need to be considered in this case. As Larsen and Shalev say themselves, the argument is conceptually similar to the one for alternating groups.

  3. Finally, the simple Lie-type groups of bounded rank are treated by means of an argument using results of algebraic geometry.

It turns out that Larsen and Shalev’s arguments in the first two cases can be modified to prove the following, which is the main result of this section:

Theorem 3.1.2.

Let $w$ be a reduced word of length $l\ge1$ in $d$ distinct variables, and $M=M(w)$ as in Notation 1.1.1. Then the following hold:

  1. For all $n\in\mathbb{N}^+$ with $n\ge256l^{16}\mathrm{e}^{16Md-2}$, we have

    $$\mathfrak{P}_w(\mathcal{A}_n)\le|\mathcal{A}_n|^{d-1/(16M)}.$$

  2. For all simple Lie-type groups $S$ of untwisted Lie rank at least $72(d+1)^2l^2$, we have

    $$\mathfrak{P}_w(S)\le|S|^{d-1/(72(d+1)l^2)}.$$

Whether such bounds can also be established for the simple Lie-type groups of “small” rank is open; see Section 5 for some more remarks on this.

3.2 Reduction of Theorem 3.1.2 to Theorem 3.2.6

Similarly to [3], the main part of the argument for Theorem 3.1.2 will not provide upper bounds on $\mathfrak{P}_w(S)$ for the simple groups $S$ in question directly, but on $\mathfrak{P}_w^{(A)}(G)$, where $G$ is a finite group “closely related with $S$” and $A$ a certain subgroup of $\mathrm{Aut}(G)$. That we can do this without loss of generality is justified by the following lemma, a modification of [3, Lemma 2.1], which served the same purpose:

Lemma 3.2.1.

Let $w$ be a nonempty reduced word in $d$ distinct variables, $\mathcal{G}$ and $\mathcal{H}$ infinite classes of finite groups, $N,\eta>0$. Assume that for each $H\in\mathcal{H}$ with $|H|\ge N$, there is associated a subgroup $A(H)\le\mathrm{Aut}(H)$ such that

$$\mathfrak{P}_w^{(A(H))}(H)\le|H|^{d-\eta}$$

(and note that this implies $\eta\le d$). Set $\epsilon:=\eta/(2(1+d-\eta))>0$. Finally, assume that there exists $C>0$ such that for all $G\in\mathcal{G}$ with $|G|\ge C$, the following exist:

  1. an $H\in\mathcal{H}$ such that $|H|\le|G|^{1+\epsilon}$,

  2. characteristic subgroups $K$ and $L$ of $H$ with $K\le L$ such that $G\cong L/K$ (we say that $G$ is a characteristic section of $H$) and such that every automorphism of $G$ can be induced by the restriction to $L$ of a suitable automorphism of $H$ from $A(H)$.

Then the following holds: For all $G\in\mathcal{G}$ with $|G|\ge\max\{N,C\}$,

$$\mathfrak{P}_w(G)\le|G|^{d-\eta/2}.$$

Proof.

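Let $G\in\mathcal{G}$ with $|G|\ge\max\{N,C\}$. Fix $H\in\mathcal{H}$ with $|H|\le|G|^{1+\epsilon}$ containing characteristic subgroups $K$ and $L$ as described in the assumptions. We assume without loss of generality that $G=L/K$ (not just isomorphic). Note that we have in particular that $|H|\ge|G|\ge N$, so that $\mathfrak{P}_w^{(A(H))}(H)\le|H|^{d-\eta}$ by assumption.

We want to bound the fiber sizes of automorphic word maps over $G$. To this end, fix automorphisms $\tilde{\alpha_1},\ldots,\tilde{\alpha_l}\in\mathrm{Aut}(G)$ and $g\in G$ such that the size of the fiber of $g$ under $w_G^{(\tilde{\alpha_1},\ldots,\tilde{\alpha_l})}$ equals $\mathfrak{P}_w(G)$, fix $h\in L$ projecting onto $g\in G=L/K$, and fix $\alpha_1,\ldots,\alpha_l\in A(H)\le\mathrm{Aut}(H)$ such that for $i=1,\ldots,l$, the restriction $(\alpha_i)_{\mid L}$ induces $\tilde{\alpha_i}$ on $G$.

Since each fiber of the map $w_H^{(\alpha_1,\ldots,\alpha_l)}$, and thus in particular each fiber of the map $w_L^{((\alpha_1)_{\mid L},\ldots,(\alpha_l)_{\mid L})}$, has size at most $|H|^{d-\eta}$, and since the fiber of $g$ under the map $w_G^{(\tilde{\alpha_1},\ldots,\tilde{\alpha_l})}$ can be expressed as the image under the canonical projection $L\to G$ of a disjoint union of at most $|G|^{\epsilon}$ many fibers of $w_L^{((\alpha_1)_{\mid L},\ldots,(\alpha_l)_{\mid L})}$, we get

$$\mathfrak{P}_w(G)=\bigl|(w_G^{(\tilde{\alpha_1},\ldots,\tilde{\alpha_l})})^{-1}[\{g\}]\bigr|\le|G|^{\epsilon}\cdot|H|^{d-\eta}\le|G|^{\epsilon+(1+\epsilon)(d-\eta)}=|G|^{d-\eta/2},$$

where the last equality is by the definition of $\epsilon$. ∎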
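The last equality in this proof is a short computation: $\epsilon+(1+\epsilon)(d-\eta)=\epsilon(1+d-\eta)+d-\eta=\eta/2+d-\eta$. It can also be confirmed with exact rational arithmetic, as in the following sketch (function names are hypothetical, sample values arbitrary):

```python
from fractions import Fraction

def epsilon(d, eta):
    """epsilon := eta / (2 (1 + d - eta)) as in Lemma 3.2.1."""
    return eta / (2 * (1 + d - eta))

def combined_exponent(d, eta):
    """The exponent epsilon + (1 + epsilon)(d - eta) from the proof."""
    eps = epsilon(d, eta)
    return eps + (1 + eps) * (d - eta)

d, eta = 3, Fraction(1, 8)
print(combined_exponent(d, eta) == d - eta / 2)  # True
```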

Lemmata 3.2.3 and 3.2.5 below will allow us to reduce the proof of Theorem 3.1.2 to the proof of a theorem concerning fibers of automorphic word maps in slightly different classes of groups, Theorem 3.2.6. For example, for the alternating groups, these “closely related” groups will be just the symmetric groups. To make the formulations of the lemmata shorter, let us first introduce the following terminology:

Definition 3.2.2.

Let $w$ be a nonempty reduced word in $d$ distinct variables, and let $N,\eta>0$. Furthermore, let $\mathcal{G}$ be a class of finite groups and $A$ a function that maps each $G\in\mathcal{G}$ to a subgroup $A(G)\le\mathrm{Aut}(G)$. We say that $\mathcal{G}$ is $(A,N,\eta)$-nice for $w$, or that $(A,N,\eta)$ is a niceness tuple of $\mathcal{G}$ for $w$, if and only if for all $G\in\mathcal{G}$ with $|G|\ge N$, we have $\mathfrak{P}_w^{(A(G))}(G)\le|G|^{d-\eta}$.

The following lemma allows us to reduce Theorem 3.1.2 (1) to the study of automorphic word map fibers in symmetric groups:

Lemma 3.2.3.

Let $w$ be a nonempty reduced word in $d$ distinct variables. Further, assume that for some $N\in\mathbb{N}^+$ and some $\eta>0$, the class of finite symmetric groups is $(\mathrm{Aut},N,\eta)$-nice for $w$. Then the class of finite alternating groups is $(\mathrm{Aut},\max\{N,2^{2(1+d-\eta)/\eta+1}\},\eta/2)$-nice for $w$.

Proof.

Set $\epsilon:=\eta/(2(1+d-\eta))$. For proving the lemma, we want to apply Lemma 3.2.1 with $\mathcal{H}$ the class of finite symmetric groups and $\mathcal{G}$ the class of finite alternating groups. For all $n\in\mathbb{N}^+$, every automorphism of $\mathcal{A}_n$ extends to an automorphism of $\mathcal{S}_n$; hence it suffices to prove that for $C:=2^{1+2(1+d-\eta)/\eta}=2^{1+1/\epsilon}$, $|\mathcal{A}_n|\ge C$ implies $|\mathcal{S}_n|\le|\mathcal{A}_n|^{1+\epsilon}$, which is elementary. ∎

As for the classical groups of Lie type $X(q)$ with which we are concerned in Theorem 3.1.2 (2), the groups “closely related” with them which we will study are, just as in [3, beginning of Section 3], the isometry groups of trivial, perfect symmetric, perfect anti-symmetric or perfect Hermitian pairings (depending on the case) of a vector space over either the field $\mathbb{F}_q$ or (only in the Hermitian case) its degree $2$ extension $\mathbb{F}_{q^2}$. Set $E:=\mathbb{F}_q$, and moreover, set $F:=E$ except in the Hermitian case, where $F:=\mathbb{F}_{q^2}$.

In the notation of Kleidman and Liebeck’s book [2, Section 2.1], the classical simple Lie-type group $X(q)$ is $\overline{\Omega}$ and is the projective version of a subgroup $\Omega$ of the associated isometry group, which is denoted by $I$. Note that $I$, in turn, is contained as a normal subgroup in some group $A$ which (by its conjugation action on $I$) may be viewed as a subgroup of $\mathrm{Aut}(I)$ and is just the group $\Gamma$ of collineations over $I$ except when $I=\mathrm{GL}_n(q)$ is the isometry group of a trivial form, in which case $A$ is the subgroup of $\mathrm{Aut}(I)$ generated by $\Gamma$ and the inverse-transpose automorphism of $I$. For later purposes, we set $A(I):=A$.

It follows from [2, Theorem 2.1.4] that if the untwisted Lie rank of $\overline{\Omega}$ is at least $5$ (this is just to exclude the groups $C_2(2^f)=\mathrm{Sp}_4(2^f)$ and $D_4(q)=\Omega_8^+(q)$), then every automorphism of $\overline{\Omega}$ is induced by the restriction to $\Omega$ of an automorphism of $I$ from $A=A(I)$, as required in Lemma 3.2.1. Furthermore, as Larsen and Shalev observe in [3, beginning of Section 3], we always have $|I|\le(|F|-1)(r+1)|\overline{\Omega}|$, where $r$ is the untwisted Lie rank of $\overline{\Omega}$. They also observe that for every $\epsilon>0$, $|\overline{\Omega}|^{\epsilon}\ge(r+1)(|F|-1)$ if $r$ is sufficiently large. This “sufficiently large” can be made explicit.

Lemma 3.2.4.

For all $\epsilon>0$, the following holds: With notation as above, for all simple $\overline{\Omega}$, if $r\ge\epsilon^{-1}$, then $|\overline{\Omega}|^{\epsilon}\ge(r+1)(|F|-1)$, and so $|I|\le|\overline{\Omega}|^{1+\epsilon}$.

Proof.

This follows from

$$|\overline{\Omega}|^{1/r}\ge(r+1)(|F|-1),\tag{3.1}$$

verifiable in each of the six cases $\overline{\Omega}=A_r(q),B_r(q),C_r(q),D_r(q),{}^2A_r(q),{}^2D_r(q)$ using the known formula for $|\overline{\Omega}|$; note that formula (3.1) does not hold for the particular choice $\overline{\Omega}={}^2A_2(2)$, but this is not a problem, since ${}^2A_2(2)$ is not simple. ∎
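As an illustration of how formula (3.1) can be verified, the sketch below checks it for the family $A_r(q)=\mathrm{PSL}_{r+1}(q)$ over a small range of parameters and reproduces the failure for ${}^2A_2(2)$. It relies on the standard order formulas for $\mathrm{PSL}_n(q)$ and $\mathrm{PSU}_n(q)$; the function names and the tested parameter range are choices made here, and the other four families are not covered.

```python
from math import gcd

def order_psl(n, q):
    """|PSL_n(q)|, i.e. the order of A_{n-1}(q)."""
    size = q ** (n * (n - 1) // 2)
    for i in range(2, n + 1):
        size *= q ** i - 1
    return size // gcd(n, q - 1)

def order_psu(n, q):
    """|PSU_n(q)|, i.e. the order of 2A_{n-1}(q); here F has q^2 elements."""
    size = q ** (n * (n - 1) // 2)
    for i in range(2, n + 1):
        size *= q ** i - (-1) ** i
    return size // gcd(n, q + 1)

def satisfies_3_1(order, r, field_size):
    """Check |Omega|^(1/r) >= (r+1)(|F|-1) exactly, by comparing r-th powers."""
    return order >= ((r + 1) * (field_size - 1)) ** r

print(order_psu(3, 2))                       # 72
print(satisfies_3_1(order_psu(3, 2), 2, 4))  # False, since 72 < 9^2
print(satisfies_3_1(order_psl(3, 2), 2, 2))  # True, since 168 >= 3^2
```

Comparing $r$-th powers of integers avoids floating-point roots, so the check is exact.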

We can now show the following:

Lemma 3.2.5.

Let $N,\eta>0$ and let $\mathcal{G}$ be a class of finite groups consisting only of the isometry groups $I$ associated with the members of a subclass $\mathcal{H}$ of the class of finite classical simple Lie-type groups $\overline{\Omega}$ of untwisted Lie rank at least $\max\{5,2(1+d-\eta)/\eta\}$. Further, assume that $\mathcal{G}$ is $(A,N,\eta)$-nice for $w$. Then $\mathcal{H}$ is $(\mathrm{Aut},N,\eta/2)$-nice for $w$.

Proof.

By the assumption on the untwisted Lie rank of members of $\mathcal{H}$, the assumptions of Lemma 3.2.1 with $C:=1$ are satisfied; more precisely, fixing an element $\overline{\Omega}\in\mathcal{H}$:

  1. Since the untwisted Lie rank of $\overline{\Omega}$ is at least $5$, by the observations before Lemma 3.2.4, considering the associated isometry group $I\in\mathcal{G}$, $\overline{\Omega}$ is a characteristic section of $I$ such that every automorphism of $\overline{\Omega}$ “comes from” an automorphism from $A(I)\le\mathrm{Aut}(I)$.

  2. Furthermore, since the untwisted Lie rank of $\overline{\Omega}$ is at least $2(1+d-\eta)/\eta=\epsilon^{-1}$, by Lemma 3.2.4, we also have $|I|\le|\overline{\Omega}|^{1+\epsilon}$.

Hence we are done by an application of Lemma 3.2.1. ∎

We now give the aforementioned theorem to which Theorem 3.1.2 reduces:

Theorem 3.2.6.

Let $w$ be a reduced word of length $l\ge1$ in $d$ distinct variables, and let $M$ be as in Notation 1.1.1. Then the following hold:

  1. The class of finite symmetric groups is $(\mathrm{Aut},\lceil16l^{16}\mathrm{e}^{16Md-2}\rceil!,1/(8M))$-nice for $w$.

  2. The class of isometry groups associated with the finite simple groups of Lie type of untwisted Lie rank at least $72(d+1)^2l^2$ is $(A,1,1/(36(d+1)l^2))$-nice for $w$.

Let us actually derive Theorem 3.1.2 from this.

Proof of Theorem 3.1.2 using Theorem 3.2.6.

(1) Applying Lemma 3.2.3, we get from Theorem 3.2.6 (1) that the class of finite alternating groups has the following niceness tuple for $w$:

$$(\mathrm{Aut},\max\{\lceil16l^{16}\mathrm{e}^{16Md-2}\rceil!,2^{1+2(1+d-\eta)/\eta}\},1/(16M)),$$

where $\eta=1/(8M)$. It is not difficult to check that the second term in the maximum expression in the second entry of the tuple is smaller than the first term, and we are done.

(2) By Lemma 3.2.5 and Theorem 3.2.6 (2), we only need to check that for $\eta=1/(36(d+1)l^2)$, we have $2(1+d-\eta)/\eta\le72(d+1)^2l^2$, which is elementary. ∎
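The elementary check in point (2) amounts to the identity $2(1+d-\eta)/\eta=2(d+1)/\eta-2=72(d+1)^2l^2-2$ for $\eta=1/(36(d+1)l^2)$. A small sketch with exact fractions (hypothetical function names) confirms this for sample values:

```python
from fractions import Fraction

def rank_requirement(d, l):
    """2(1 + d - eta)/eta for eta = 1/(36 (d+1) l^2), the rank needed in Lemma 3.2.5."""
    eta = Fraction(1, 36 * (d + 1) * l ** 2)
    return 2 * (1 + d - eta) / eta

def rank_threshold(d, l):
    """The rank bound 72 (d+1)^2 l^2 assumed in Theorem 3.1.2 (2)."""
    return 72 * (d + 1) ** 2 * l ** 2

print(rank_requirement(1, 1), rank_threshold(1, 1))  # 286 288
```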

3.3 First part of the proof of Theorem 3.2.6: Symmetric groups and isometry groups other than general linear groups

We now turn to the proof of Theorem 3.2.6, which as mentioned before, is a modification of an argument by Larsen and Shalev from [3]. Let us first make some general observations which will be used in the proof.

Note that each of the abstract groups $G$ with which Theorem 3.2.6 deals can actually be viewed as a permutation group, acting on a set $\Delta$, in a natural way: each symmetric group $\mathcal{S}_n$ through its natural action on the set $\{1,\ldots,n\}$, and each isometry group through its action on the corresponding vector space. Larsen and Shalev also exploited this fact, and their argument consisted essentially in investigating to what extent a relation of the form $w(g_1,\ldots,g_d)=g$ for $g\in G$ fixed imposes restrictions on $g_1,\ldots,g_d\in G$ when viewed as maps $\Delta\to\Delta$.

In our setting, this gets more complicated because we are actually considering not a word equation in $g_1,\ldots,g_d$, but a word equation in various images of the $g_i$ under fixed automorphisms of $G$ from some subgroup $A(G)\le\mathrm{Aut}(G)$. Hence it would be useful if, from some single piece of mapping information of the form $\alpha(g_i)(x)=y$, $\alpha\in A(G)$ and $x,y\in\Delta$, we could derive such a condition on $g_i$ itself. It turns out that this is actually possible for the $G$ with which we are concerned except for the case $G=\mathrm{GL}_n(q)$, which will require some separate treatment.

In this subsection, we deal with the $G$ not isomorphic with any $\mathrm{GL}_n(q)$.

Notation 3.3.1.

We introduce the following notation:

  1. As $G=\mathcal{S}_n$ with $n\ge7$ is complete, for each automorphism $\alpha$ of $G$, there is a unique $\sigma\in G$ such that $\alpha=\mathrm{conj}(\sigma)$. We set $t(\alpha):=\sigma^{-1}\in\mathcal{S}_n=\mathcal{S}_\Delta$ with $\Delta=\{1,\ldots,n\}$.

  2. Let $G=I$ be the isometry group of either a perfect symmetric, perfect anti-symmetric or perfect Hermitian pairing of a finite vector space $\Delta=\mathbb{F}_{q^e}^m$, where $e=1$ in the symmetric and anti-symmetric case and $e=2$ in the Hermitian case. By the definition of $A(I)\le\mathrm{Aut}(I)$ above, it is clear that every element $\alpha\in A(I)$ is of the form $\mathrm{conj}(U)\circ\mathrm{aut}(\sigma)$, where $U\in I$ and $\mathrm{aut}(\sigma)$ is a field automorphism of $I$, induced by an automorphism $\sigma$ of $\mathbb{F}_{q^e}$. This automorphism $\sigma$ also induces a permutation $\mathrm{perm}(\sigma)$ on $\Delta$, namely the map $\Delta\to\Delta,(x_1,\ldots,x_m)\mapsto(\sigma(x_1),\ldots,\sigma(x_m))$. We set

    $$t(\alpha):=(U\circ\mathrm{perm}(\sigma))^{-1}\in\mathcal{S}_\Delta.$$

The point behind Notation 3.3.1 is that in each of the cases considered, the automorphism $\alpha$ of $G$ can be seen as the restriction of the conjugation by $t(\alpha)^{-1}\in\mathcal{S}_\Delta$ to the subgroup $G$ of $\mathcal{S}_\Delta$. Hence the following is clear:

Lemma 3.3.2.

Let $G$ be $\mathcal{S}_n$ for some $n\ge7$, respectively an isometry group as in Notation 3.3.1 (2). Then for every $\alpha\in\mathrm{Aut}(G)$ (respectively $\alpha\in A(G)$), for every $g\in G$, and for all $x,y\in\Delta$, the set on which $G$ acts naturally, we have $\alpha(g)(x)=y$ if and only if $g(t(\alpha)(x))=t(\alpha)(y)$. ∎
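For symmetric groups, the equivalence in Lemma 3.3.2 is the observation that $\sigma g\sigma^{-1}$ maps $x$ to $y$ exactly when $g$ maps $\sigma^{-1}(x)$ to $\sigma^{-1}(y)$. The sketch below checks this exhaustively for inner automorphisms of small symmetric groups. Note the assumptions: the lemma requires $n\ge7$ only to guarantee that every automorphism is inner, whereas the check uses $n\le4$ and inner automorphisms only, to keep the search small; permutations act on $\{0,\ldots,n-1\}$ and the function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Product p*q of permutations given as tuples: x -> p[q[x]]."""
    return tuple(p[q[x]] for x in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)

def check_lemma(n):
    """Test alpha(g)(x) = y  <=>  g(t(x)) = t(y) for alpha = conj(sigma), t = sigma^{-1}."""
    delta = range(n)
    group = list(permutations(delta))
    for sigma in group:
        t = inverse(sigma)  # t(alpha) as in Notation 3.3.1 (1)
        for g in group:
            alpha_g = compose(compose(sigma, g), t)  # sigma * g * sigma^{-1}
            for x in delta:
                for y in delta:
                    if (alpha_g[x] == y) != (g[t[x]] == t[y]):
                        return False
    return True

print(check_lemma(4))  # True
```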

At last, we are now ready to discuss the proofs of Theorem 3.2.6 (1) and of Theorem 3.2.6 (2) except for general linear groups; we will present these proofs one after the other.

Proof of Theorem 3.2.6 (1).

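Let $G=\mathcal{S}_n$ with $n\ge16l^{16}\mathrm{e}^{16Md-2}$. We need to show that $\mathfrak{P}_w(G)\le|G|^{d-1/(8M)}$. By Lemma 2.1 (2), we know that

$$\mathfrak{P}_w(G)=\mathfrak{P}_w(G,1),$$

so we only need to bound the maximum size of the fiber $\Phi$ of $1=\mathrm{id}$ under an automorphic word map on $G$. Hence fix automorphisms $\alpha_1,\ldots,\alpha_l\in\mathrm{Aut}(G)\cong G$, and, as usual, write $w=x_1^{\epsilon_1}\cdots x_l^{\epsilon_l}$ with $\epsilon_i\in\{\pm1\}$, $x_1,\ldots,x_l\in\{X_1,\ldots,X_d\}$ and $\iota\colon\{1,\ldots,l\}\to\{1,\ldots,d\}$ such that $x_i=X_{\iota(i)}$.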

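We associate with each fixed $d$-tuple $\vec{g}=(g_1,\ldots,g_d)\in G^d$ a certain metric $d_{\vec{g}}^{(\alpha_1,\ldots,\alpha_l)}$ on $\Delta=\{1,\ldots,n\}$, as follows: For $y,z\in\Delta$, if $z$ can be obtained from $y$ through a finite number of applications of permutations on $\Delta$, each of one of the two forms

  1. $\alpha_i(g_j)^{\pm1}$, where $i\in\{1,\ldots,l\}$ and $j\in\{1,\ldots,d\}$, or

  2. $t(\alpha_i)^{\pm1}$, where $i\in\{1,\ldots,l\}$,

then $d_{\vec{g}}^{(\alpha_1,\ldots,\alpha_l)}(y,z)$ is defined as the smallest number of such function applications which it takes to pass from $y$ to $z$. If, on the other hand, $z$ cannot be obtained from $y$ in this way, we set $d_{\vec{g}}^{(\alpha_1,\ldots,\alpha_l)}(y,z):=n$. It is easy to check that $d_{\vec{g}}^{(\alpha_1,\ldots,\alpha_l)}$ really is a metric on $\Delta$.

We call elements $y,z\in\Delta$ independent if and only if there do not exist $\alpha,\beta\in\{\alpha_1,\ldots,\alpha_l\}$ such that $t(\alpha)(y)=t(\beta)(z)$, otherwise we call them dependent. Furthermore, we define

$$u_j:=x_{l-j+1}^{\epsilon_{l-j+1}}\cdots x_l^{\epsilon_l},\quad j=0,\ldots,l$$

(the terminal segments of $w$), so that $x_{l-j}^{\epsilon_{l-j}}u_j=u_{j+1}$. Finally, we set

$$\nu_j:=(u_j)_G^{(\alpha_{l-j+1},\ldots,\alpha_l)}(\vec{g})\in G,\quad j=0,\ldots,l,$$

and $L:=\lfloor n/(4M)\rfloor$.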

Now let $\vec{z}=(z_1,\ldots,z_L)$ denote an ordered $L$-tuple of elements of $\Delta$. We consider two sets $X$ and $X'$:

$$X:=\{(\vec{g},\vec{z})\in G^d\times\Delta^L\mid d_{\vec{g}}^{(\alpha_1,\ldots,\alpha_l)}(z_i,z_j)>2l+2\text{ for }i\ne j,\text{ and }|\{z_i,\nu_1(z_i),\ldots,\nu_l(z_i)\}|\le l\text{ for all }i\}\tag{3.2}$$

and

$$X':=\{(\vec{g},\vec{z})\in G^d\times\Delta^L\mid d_{\vec{g}}^{(\alpha_1,\ldots,\alpha_l)}(z_i,z_j)>2l+2\text{ for }i\ne j,\text{ and for all }i\text{ there exist }j_1,j_2\in\{0,\ldots,l\}\text{ such that }j_1\ne j_2,\text{ and }\nu_{j_1}(z_i)\text{ and }\nu_{j_2}(z_i)\text{ are dependent}\}.\tag{3.3}$$

Note that the second condition, $|\{z_i,\nu_1(z_i),\ldots,\nu_l(z_i)\}|\le l$, in formula (3.2) just means that two of the elements $z_i,\nu_1(z_i),\ldots,\nu_l(z_i)$ are equal, which is a stronger condition than the second condition in formula (3.3). Hence $X\subseteq X'$. Our goal is to determine an upper bound on $|X|$, and to this end, we bound $|X'|$.

We begin by fixing two $L$-tuples $(a_1,\ldots,a_L)$ and $(b_1,\ldots,b_L)$ of non-negative integers such that $a_i<b_i\le l$ for all $i$, as well as two $L$-tuples $(\gamma_1,\ldots,\gamma_L)$ and $(\delta_1,\ldots,\delta_L)$ with entries from the set $\{\alpha_1,\ldots,\alpha_l\}$. There are fewer than $l^{4L}$ choices for this.

For each such choice, we count only the elements $(\vec{g},\vec{z})$ of $X'$ such that for all $i\in\{1,\ldots,L\}$, the elements $z_i,\nu_1(z_i),\ldots,\nu_{b_i-1}(z_i)\in\Delta$ are pairwise independent, while the dependence relation $t(\gamma_i)(\nu_{b_i}(z_i))=t(\delta_i)(\nu_{a_i}(z_i))$ holds. There are fewer than $n^{b_1+\cdots+b_L}$ ways of choosing ordered tuples $Z_1,\ldots,Z_L$ with entries from $\Delta$ and of length $b_1,\ldots,b_L$, respectively, such that the entries of each tuple are pairwise independent. For fixed $(Z_1,\ldots,Z_L)$, we count only elements of $X'_{Z_1,\ldots,Z_L}$, i.e., only elements of $X'$ as specified above such that additionally,

$$(z_i,\nu_1(z_i),\ldots,\nu_{b_i-1}(z_i))=Z_i\quad\text{for each }i=1,\ldots,L.$$

Now the distance condition in formula (3.3) implies that if any coordinate of $Z_i$ is in dependence with any coordinate of $Z_j$ for $i\ne j$, then $X'_{Z_1,\ldots,Z_L}=\emptyset$. We may therefore assume that coordinates of $Z_i$ and $Z_j$, $i\ne j$, are always independent, a feature which we call the inter-independence of the $Z_i$.

Note that for each $i\in\{1,\ldots,L\}$ and each $j\in\{1,\ldots,b_i\}$, we get the following condition on one of the functions $g_1,\ldots,g_d\colon\Delta\to\Delta$:

  1. if $\epsilon_{l-j+1}=+1$ and $\iota(l-j+1)=k$, then $\alpha_{l-j+1}(g_k)(\nu_{j-1}(z_i))=\nu_j(z_i)$, or equivalently (by Lemma 3.3.2) $g_k(t(\alpha_{l-j+1})(\nu_{j-1}(z_i)))=t(\alpha_{l-j+1})(\nu_j(z_i))$.

  2. if $\epsilon_{l-j+1}=-1$ and $\iota(l-j+1)=k$, then $\alpha_{l-j+1}(g_k)(\nu_j(z_i))=\nu_{j-1}(z_i)$, or equivalently $g_k(t(\alpha_{l-j+1})(\nu_j(z_i)))=t(\alpha_{l-j+1})(\nu_{j-1}(z_i))$.

Let us introduce some terminology for conditions of the form $f(x)=y$, where $f$ is a variable standing for a function $\Delta\to\Delta$ and $x,y\in\Delta$ are fixed. We call $x$ the argument and $y$ the image in the condition $f(x)=y$. Call two such conditions $f(x_1)=y_1$ and $g(x_2)=y_2$ independent if and only if either $f$ and $g$ are distinct variables or $f=g$ and $x_1\ne x_2$. Two conditions that are not independent are called dependent. Finally, the conditions $f(x_1)=y_1$ and $g(x_2)=y_2$ are called contradictory if and only if $f=g$, $x_1=x_2$ and $y_1\ne y_2$.

Equipped with this terminology, we note that for fixed $i$, either two of the $b_i$ conditions on the $g_k$ derived above are contradictory (so that $X'_{Z_1,\ldots,Z_L}=\emptyset$ in this case as well), or the conditions are pairwise independent. To see this, note that if the conditions are not pairwise independent, then since we are assuming that $z_i,\nu_1(z_i),\ldots,\nu_{b_i-1}(z_i)$ are pairwise independent elements of $\Delta$ (in the sense defined before formula (3.2)), the existing pair of dependent conditions is unique, and one of the two conditions has an image of the form $t(\alpha)(\nu_j(z_i))$ with $1\le j\le b_i-1$, and the other condition is

$$g_k(t(\alpha_{l-b_i+1})(\nu_{b_i}(z_i)))=t(\alpha_{l-b_i+1})(\nu_{b_i-1}(z_i)).$$

Now since no two consecutive terms in the sequence $x_l^{\epsilon_l},\ldots,x_1^{\epsilon_1}$ are mutually inverse in the corresponding free group, we must have $j<b_i-1$, but this, again by the pairwise independence of $z_i,\nu_1(z_i),\ldots,\nu_{b_i-1}(z_i)$, shows that the images in the two conditions cannot be equal, and so the conditions are contradictory, as we wanted to show.

We may thus assume that for fixed $i$, the $b_i$ conditions listed above are pairwise independent, and the inter-independence of the $Z_i$ then guarantees us that actually all the $b_1+\cdots+b_L$ conditions described above are pairwise independent. As the number of elements of $X'_{Z_1,\ldots,Z_L}$ is bounded from above by the number of $d$-tuples of functions $\Delta\to\Delta$ satisfying all the $b_1+\cdots+b_L$ conditions above, we conclude that $|X'_{Z_1,\ldots,Z_L}|\le n^{dn-b_1-\cdots-b_L}$. It follows that

$$|X|\le|X'|\le l^{4L}n^{dn}.\tag{3.4}$$

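To get an upper bound on the size of $\Phi$, the fiber of $\mathrm{id}$ under $w_G^{(\alpha_1,\ldots,\alpha_l)}$, from this, note that for each $\vec{g}\in G^d$ lying in that fiber, we have

$$(\{\vec{g}\}\times\Delta^L)\cap X=\{(\vec{g},\vec{z})\in\{\vec{g}\}\times\Delta^L\mid d_{\vec{g}}^{(\alpha_1,\ldots,\alpha_l)}(z_i,z_j)>2l+2,\ i\ne j\}.$$

Now the ball $B_{2l+2}(z)$ of radius $2l+2$ with respect to the metric $d_{\vec{g}}^{(\alpha_1,\ldots,\alpha_l)}$ around any $z\in\Delta$ has, by definition of $M$, cardinality at most $M$ (each step applies one of at most $2l(d+1)$ permutations). Furthermore, by definition of $L$, $LM\le n/2$. Hence if we select $z_1,\ldots,z_L\in\Delta$ iteratively so that for each $j=1,\ldots,L$,

$$z_j\in\{z\in\Delta\mid d_{\vec{g}}^{(\alpha_1,\ldots,\alpha_l)}(z,z_i)>2l+2\text{ for all }i<j\},$$

then the number of possibilities for $z_j$ is at least

$$\Bigl|\Delta\setminus\bigcup_{i=1}^{j-1}B_{2l+2}(z_i)\Bigr|\ge\frac{n}{2}.$$

It follows that

$$|(\{\vec{g}\}\times\Delta^L)\cap X|\ge\Bigl(\frac{n}{2}\Bigr)^L.$$

Hence we also have a lower bound on the cardinality of $X$:

$$|X|\ge|(\Phi\times\Delta^L)\cap X|\ge|\Phi|\Bigl(\frac{n}{2}\Bigr)^L.\tag{3.5}$$

Combining formula (3.5) with the upper bound on $|X|$ from formula (3.4), we conclude that

$$|\Phi|\le\Bigl(\frac{n}{2}\Bigr)^{-L}l^{4L}n^{dn}=(\sqrt[4]{2}\,l)^{4L}n^{dn-L}\le(\sqrt[4]{2}\,l)^{n/M}n^{1+(d-1/(4M))n}.\tag{3.6}$$

From the explicit Stirling-like bound $n!\ge(n/\mathrm{e})^n$ (which, as noted in [6], is an immediate consequence of the Taylor expansion of the exponential function), it is clear from formula (3.6) that $|\Phi|\le|G|^{d-1/(8M)}$ as long as

$$(\sqrt[4]{2}\,l)^{1/M}n^{1/n+d-1/(4M)}\le\Bigl(\frac{n}{\mathrm{e}}\Bigr)^{d-1/(8M)}.\tag{3.7}$$

But our assumption $n\ge16l^{16}\mathrm{e}^{16Md-2}=(\sqrt{2}\,l^2)^8\mathrm{e}^{16Md-2}$ is equivalent to $(\sqrt[4]{2}\,l)^{1/M}\mathrm{e}^{d-1/(8M)}\le n^{1/(16M)}$, by which it is easy to verify (3.7). ∎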
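Since $n$ is astronomically large here, a numerical sanity check of (3.7) has to work with $\ln n$ instead of $n$. The following sketch does this in floating-point arithmetic for a few sample pairs $(d,l)$; it is a plausibility check at selected values only, not a proof, and the function names are hypothetical.

```python
import math

def M_const(d, l):
    """The constant M(d, l) from Notation 1.1.1."""
    base = 2 * l * (d + 1)
    return sum(base ** k for k in range(2 * l + 3))

def log_threshold(d, l):
    """Natural logarithm of the threshold 16 l^16 e^(16Md-2)."""
    return math.log(16) + 16 * math.log(l) + 16 * M_const(d, l) * d - 2

def slack_3_7(d, l, log_n):
    """ln(right-hand side) - ln(left-hand side) of (3.7) for n = exp(log_n)."""
    M = M_const(d, l)
    a = math.log(2) / 4 + math.log(l)          # ln of (2^(1/4) * l)
    log_n_over_n = log_n * math.exp(-log_n)    # (ln n)/n; underflows to 0.0 for huge n
    lhs = a / M + log_n_over_n + (d - 1 / (4 * M)) * log_n
    rhs = (d - 1 / (8 * M)) * (log_n - 1)
    return rhs - lhs

for d, l in [(1, 1), (1, 2), (2, 2)]:
    print(d, l, slack_3_7(d, l, log_threshold(d, l)) > 0)  # True in each case
```

At the threshold itself, the slack works out to roughly $d$, so inequality (3.7) holds there with room to spare.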

For the other proof, we require the following lemma, which is essentially [3, Lemma 3.2].

Lemma 3.3.3.

Let $G$ be the isometry group, acting naturally on a finite vector space $\Delta$, associated with a classical finite simple group of Lie type $S=X(q)$. Set $E:=\mathbb{F}_q$, and denote by $F$ the finite field such that $\Delta$ is an $F$-vector space (recall that either $F=E$ or, in the Hermitian case, $F$ is a quadratic extension of $E$). Set $n:=\dim_E(\Delta)$, and let $v_1,\ldots,v_k$ be $E$-linearly independent vectors in $\Delta$ such that $n\ge2k+2$. Then $|\mathrm{Stab}_G(v_1,\ldots,v_k)|\le q^{k^2+k-kn}|G|$.

Proof of Theorem 3.2.6 (2) except for general linear groups.

Let $G$ be the isometry group of either a perfect symmetric, perfect anti-symmetric or perfect Hermitian pairing on a finite $F$-vector space $\Delta$. In the first two cases, set $E:=F$, and in the Hermitian case, let $E$ be the unique subfield of $F$ such that $[F:E]=2$. Furthermore, set $q:=|E|$ and $n:=\dim_E(\Delta)$ as well as $m:=\dim_F(\Delta)=n/e$ (with $e$ as in Notation 3.3.1 (2)), so that without loss of generality $\Delta=\mathbb{F}_{q^e}^m$ and Notation 3.3.1 (2) is applicable. Finally, fix $\alpha_1,\dots,\alpha_l\in A(G)$.

Under these assumptions, we will actually show something stronger than what is asserted in Theorem 3.2.6 (2) for all isometry groups (including the general linear groups), namely that if $n\ge 216l^2$, then the size of the fiber $\Phi$ of $1_G=\mathrm{id}$ under $w_G^{(\alpha_1,\dots,\alpha_l)}$ is at most $|G|^{d-1/(72l^2)}$ (note that it is sufficient to consider that fiber by Lemma 2.1 (2), as $A(G)$ contains $\mathrm{Inn}(G)$). As before, the argument is a modification of a proof of Larsen and Shalev, namely of [3, proof of Proposition 3.3]. Compared to their situation, we have the advantage that we only need to consider the fiber of $\mathrm{id}$, not of an arbitrary isometry with an eigenvalue of multiplicity at least $n/3$, so that some parts of the construction become simpler, while others become more complicated in order to still work for automorphic word maps.

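Let $g=(g_1,\dots,g_d)$ denote a $d$-tuple of elements of $G$. We define $u_j$ and $\nu_j$ by the same formulas as in the proof of Theorem 3.2.6 (1) above. Furthermore, we set $L:=\lfloor n/(9l^2)\rfloor$ and let $z=(z_1,\dots,z_L)$ denote an $L$-tuple of elements of $\Delta$. We define the lexicographic order on the set $\{1,\dots,L\}\times\{0,\dots,l\}$ through $(i',j')<(i,j)$ if and only if $i'<i$, or $i'=i$ and $j'<j$. Finally, we define

$$\begin{aligned}
X:=\Bigl\{(g,z)\in G^d\times\Delta^L \Bigm|{}& z_i=\nu_0(z_i)\notin\operatorname{Span}_{(i',j)<(i,0),\,k=1,\dots,l}\,t(\alpha_k)(\nu_j(z_{i'}))\\
&\text{and } \nu_l(z_i)\in\operatorname{Span}_{(i',j)<(i,l),\,k=1,\dots,l}\,t(\alpha_k)(\nu_j(z_{i'}))\\
&\text{for all } i\Bigr\},
\end{aligned}$$

where here and in the rest of this proof, for a subset $A\subseteq\Delta$, $\operatorname{Span}A$ denotes the $E$-span of $A$ inside $\Delta$.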

For each $(g,z)\in X$, we define $b_i$ to be the smallest positive integer such that

$$\nu_{b_i}(z_i)\in\operatorname{Span}_{(i',j)<(i,b_i),\,k=1,\dots,l}\,t(\alpha_k)(\nu_j(z_{i'})). \tag{3.8}$$

Note that $1\le b_i\le l$, and so $b_1+\cdots+b_L\le lL$. We make formula (3.8) more explicit by fixing $a_{i,i',j,k}\in E=\mathbb{F}_q$ such that

$$\nu_{b_i}(z_i)=\sum_{(i',j)<(i,b_i),\,k=1,\dots,l}a_{i,i',j,k}\,t(\alpha_k)(\nu_j(z_{i'})).$$

There are fewer than $q^{l^2L^2}l^L$ ways in which the $a_{i,i',j,k}$ and $b_i$ can be chosen:

  1. precisely $l^L$ ways for the choice of the $b_i$,

  2. and less than the following number of ways for the choice of the scalars $a_{i,i',j,k}$ from $E=\mathbb{F}_q$:

    $$q^{l(b_1+(l+b_2)+(2l+b_3)+\cdots+((L-1)l+b_L))}\le q^{l(lL+lL(L-1)/2)}=q^{l^2L(1+(L-1)/2)}<q^{l^2L^2}.$$

Furthermore, there are fewer than $q^{n(b_1+\cdots+b_L)}$ possibilities for the sequence of sequences

$$\bar{z}=(z_1,\dots,\nu_{b_1-1}(z_1);\,z_2,\dots,\nu_{b_2-1}(z_2);\,\dots;\,z_L,\dots,\nu_{b_L-1}(z_L))$$

such that none of the vectors in the sequence lies in the $E$-span of all the vectors obtained by applying one of the $t(\alpha_k)$, $k=1,\dots,l$, to one of the previous vectors in the sequence.

We estimate the number of elements $(g,z)$ of $X$ for fixed choices of $a_{i,i',j,k}$, $b_i$ and $\bar{z}$. Note that $z$ is already fixed now as a part of $\bar{z}$, so we need to bound the number of matching $g=(g_1,\dots,g_d)\in G^d$. Say $\alpha_k=\mathrm{conj}(U_k)\,\mathrm{aut}(\sigma_k)$ for $k=1,\dots,l$, where $U_k\in G$ and $\sigma_k$ is an automorphism of $F=\mathbb{F}_{q^e}$. Note that by the definition of $t(\alpha_k)$ in Notation 3.3.1 (2), the map $t(\alpha_k):\Delta\to\Delta$ is $F$-semilinear (in the sense of [2, bottom of p. 9]); more precisely, we have, for all $v,w\in\Delta$ and all $\lambda\in F=\mathbb{F}_{q^e}$,

$$t(\alpha_k)(v+w)=t(\alpha_k)(v)+t(\alpha_k)(w)$$

and

$$t(\alpha_k)(\lambda v)=\sigma_k^{-1}(\lambda)\,t(\alpha_k)(v).$$

Also, note that if $\lambda\in E$, then $\sigma_k^{-1}(\lambda)\in E$ as well.

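We get the following $b_1+\cdots+b_L$ conditions on the $g_k$. For each $i=1,\dots,L$:

  1. for each $j=1,\dots,b_i-1$:

    1. if $\epsilon_{l-j+1}=+1$ and $\iota(l-j+1)=k$, then

      $$\alpha_{l-j+1}(g_k)(\nu_{j-1}(z_i))=\nu_j(z_i),$$

      which by Lemma 3.3.2 is equivalent to

      $$g_k(t(\alpha_{l-j+1})(\nu_{j-1}(z_i)))=t(\alpha_{l-j+1})(\nu_j(z_i)).$$
    2. if $\epsilon_{l-j+1}=-1$ and $\iota(l-j+1)=k$, then

      $$\alpha_{l-j+1}(g_k)(\nu_j(z_i))=\nu_{j-1}(z_i),$$

      which by Lemma 3.3.2 is equivalent to

      $$g_k(t(\alpha_{l-j+1})(\nu_j(z_i)))=t(\alpha_{l-j+1})(\nu_{j-1}(z_i)).$$

  2. if $\epsilon_{l-b_i+1}=+1$ and $\iota(l-b_i+1)=k$, then

    $$\alpha_{l-b_i+1}(g_k)(\nu_{b_i-1}(z_i))=\sum_{(i',j)<(i,b_i),\,o=1,\dots,l}a_{i,i',j,o}\,\nu_j(z_{i'}),$$

    which by Lemma 3.3.2 and the semilinearity of the $t(\alpha_k)$ is equivalent to

    $$g_k(t(\alpha_{l-b_i+1})(\nu_{b_i-1}(z_i)))=\sum_{(i',j)<(i,b_i),\,o=1,\dots,l}\sigma_{l-b_i+1}^{-1}(a_{i,i',j,o})\,t(\alpha_{l-b_i+1})(\nu_j(z_{i'})).$$
  3. if $\epsilon_{l-b_i+1}=-1$ and $\iota(l-b_i+1)=k$, then

    $$\alpha_{l-b_i+1}(g_k)\Bigl(\sum_{(i',j)<(i,b_i),\,o=1,\dots,l}a_{i,i',j,o}\,\nu_j(z_{i'})\Bigr)=\nu_{b_i-1}(z_i),$$

    which is equivalent to

    $$g_k\Bigl(\sum_{(i',j)<(i,b_i),\,o=1,\dots,l}\sigma_{l-b_i+1}^{-1}(a_{i,i',j,o})\,t(\alpha_{l-b_i+1})(\nu_j(z_{i'}))\Bigr)=t(\alpha_{l-b_i+1})(\nu_{b_i-1}(z_i)).$$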

As in the proof of Theorem 3.2.6 (1), we now argue that this system of conditions of the form $g_k(v)=w$ is either contradictory (i.e., not satisfiable for any choice of the $g_k$ in $\mathrm{End}_F(\Delta)$) or the conditions are independent, meaning here that for each $k$, the family, indexed by the conditions concerning $g_k$, of all vectors appearing as arguments in one of the conditions concerning $g_k$ is $E$-linearly independent.

Indeed, assume that for some $k$, the family of argument vectors for $g_k$ is $E$-linearly dependent. Note that the lexicographic order which we defined on $\{1,\dots,L\}\times\{0,\dots,l\}$ also induces a linear order on the conditions involving the variable $g_k$, as each such condition is by definition associated with a pair $(i,j)\in\{1,\dots,L\}\times\{0,\dots,l\}$ in an injective way (for a condition as described in the last two bullet points above, this pair is $(i,b_i)$). By means of this linear order, list the conditions involving $g_k$ as follows:

$$g_k(v_1)=w_1,\ g_k(v_2)=w_2,\ \dots,\ g_k(v_{t_k})=w_{t_k}.$$

Since the tuple $(v_1,\dots,v_{t_k})$ is $E$-linearly dependent by assumption and all the vectors $v_1,\dots,v_{t_k}$ are nonzero, there exists an index $u\in\{2,\dots,t_k\}$ such that $v_u\in\operatorname{Span}\{v_1,\dots,v_{u-1}\}$. Note that if the system of conditions is satisfiable through a suitable choice of $g_1,\dots,g_d\in\mathrm{End}_F(\Delta)$, this implies that likewise $w_u\in\operatorname{Span}\{w_1,\dots,w_{u-1}\}$. We will now argue that this is not the case.

By the choice of $\bar{z}$, the assumption that $v_u\in\operatorname{Span}\{v_1,\dots,v_{u-1}\}$ implies that $g_k(v_u)=w_u$ must be a condition as described in the third bullet point above, with

$$v_u=\sum_{(i',j)<(i,b_i),\,o=1,\dots,l}\sigma_{l-b_i+1}^{-1}(a_{i,i',j,o})\,t(\alpha_{l-b_i+1})(\nu_j(z_{i'}))$$

and

$$w_u=t(\alpha_{l-b_i+1})(\nu_{b_i-1}(z_i))$$

(and thus $\epsilon_{l-b_i+1}=-1$). Using the fact that no two consecutive terms in the sequence $x_l^{\epsilon_l},\dots,x_1^{\epsilon_1}$ are mutually inverse in the corresponding free group, we get that none of the conditions $g_k(v_1)=w_1,\dots,g_k(v_{u-1})=w_{u-1}$ is associated with the pair $(i,b_i-1)$, and the assertion that $w_u\notin\operatorname{Span}\{w_1,\dots,w_{u-1}\}$ now follows again by the choice of $\bar{z}$.

Hence we may assume without loss of generality that the $b_1+\cdots+b_L$ conditions on the $g_k$ described above are independent, so that by Lemma 3.3.3 and the convexity of the function $r\mapsto r^2+r$, we see that there are no more than

$$q^{(b_1+\cdots+b_L)^2+(b_1+\cdots+b_L)-(b_1+\cdots+b_L)n}|G|^d$$

elements of $X$, subject to the choices of $a_{i,i',j,k}$, $b_i$ and $\bar{z}$. Hence

$$|X|\le l^Lq^{l^2L^2+2(b_1+\cdots+b_L)^2}|G|^d\le l^Lq^{l^2L^2+2l^2L^2}|G|^d=l^Lq^{3l^2L^2}|G|^d. \tag{3.9}$$

On the other hand, if $g\in G^d$ lies in $\Phi$, then for all $z\in\Delta^L$, $(g,z)$ is an element of $X$ if and only if for all $i=1,\dots,L$, the condition

$$z_i\notin\operatorname{Span}_{(i',j)<(i,0),\,k=1,\dots,l}\,t(\alpha_k)(\nu_j(z_{i'})) \tag{3.10}$$

is satisfied. Now for each $i$, the span on the right-hand side of formula (3.10) has $E$-dimension less than $l^2L\le n/9\le n-1$ and thus is a proper $E$-subspace of $\Delta$. It follows that in each step of iteratively fixing an $L$-tuple $(z_1,\dots,z_L)\in\Delta^L$ according to formula (3.10), we have at least $q^{n-1}$ many choices for $z_i$. Hence the number of pairs $(g,z)\in X$ with $g\in\Phi$ fixed is at least $q^{L(n-1)}$, and it follows that

$$|X|\ge|\Phi|\,q^{L(n-1)}. \tag{3.11}$$

Combining formulas (3.9) and (3.11), we get

$$\begin{aligned}
|\Phi| &\le l^Lq^{3l^2L^2-L(n-1)}|G|^d\\
&\le l^{n/(9l^2)}q^{3l^2n^2/(81l^4)-(n/(9l^2))\cdot n/2}|G|^d\\
&\le q^nq^{n^2(1/(27l^2)-1/(18l^2))}|G|^d\\
&= q^{n-n^2/(54l^2)}|G|^d\\
&= (q^{n^2})^{1/n-1/(54l^2)}|G|^d\\
&\le (q^{n^2})^{1/(216l^2)-1/(54l^2)}|G|^d\\
&= (q^{n^2})^{-1/(72l^2)}|G|^d\le|G|^{d-1/(72l^2)},
\end{aligned}$$

where in the last step, we used the fact that $|G|\le q^{n^2}$, which in the symmetric and anti-symmetric cases is trivial since $G\le\mathrm{GL}_n(q)$ then, and in the Hermitian case follows from

$$|G|=|\mathrm{GU}_n(q)|=q^{n(n-1)/2}(q^n-(-1)^n)(q^{n-1}-(-1)^{n-1})\cdots(q^2-1)(q+1),$$

see, for example, [1, p. x]. ∎

3.4 Second part of the proof of Theorem 3.2.6: General linear groups

As mentioned before, for the general linear groups $G=\mathrm{GL}_n(q)$, the argument used for the other isometry groups from Theorem 3.2.6 (2) needs to be modified. This is because the automorphisms of $G$ which can be written as $\mathrm{conj}(U)\,\mathrm{aut}(\sigma)$ for some $U\in G$ and $\sigma\in\mathrm{Aut}(\mathbb{F}_q)$ only form an index 2 subgroup, henceforth denoted by $B(\mathrm{GL}_n(q))=B(G)$, in $A(G)$. A representative for the other coset of $B(G)$ in $A(G)$ is the inverse-transpose automorphism $\tau:U\mapsto(U^{-1})^t=(U^t)^{-1}$. This also means that it is not possible in general to rewrite a condition of the form

$$\alpha(g)(v)=w$$

with $\alpha\in A(G)$ equivalently into one of the form

$$g(t(\alpha)(v))=t(\alpha)(w)$$

as before. However, it is easy to see that we can at least rewrite each such condition equivalently into one of two possible forms.

Lemma 3.4.1.

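Let $G=\mathrm{GL}_n(q)$ for some $n\in\mathbb{N}^+$ and prime power $q$, and let $\Delta:=\mathbb{F}_q^n$, an $\mathbb{F}_q$-vector space on which $G$ acts naturally. Further, let $\alpha\in A(G)$, $g\in G$ and $x,y\in\Delta$. Then the following hold:

  1. If $\alpha\in B(G)$, say $\alpha=\mathrm{conj}(U)\sigma$, then setting $t(\alpha):=(U\,\mathrm{perm}(\sigma))^{-1}\in\mathcal{S}_\Delta$ just as in Notation 3.3.1 (2), we have that

    $$\alpha(g)x=y$$

    is equivalent to

    $$g\,t(\alpha)(x)=t(\alpha)(y).$$
  2. If $\alpha\in A(G)\setminus B(G)$, say $\alpha=\beta\tau$ with $\beta=\mathrm{conj}(U)\sigma$, then

    $$\alpha(g)x=y$$

    is equivalent to

    $$g^t\,t(\beta)(y)=t(\beta)(x).$$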

Proof.

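The argument for point (1) is like the one for Lemma 3.3.2: $\alpha$ can be viewed as the restriction of the inner automorphism $\mathrm{conj}(t(\alpha)^{-1}):\mathcal{S}_\Delta\to\mathcal{S}_\Delta$ to $G\le\mathcal{S}_\Delta$.

As for point (2), note that

$$\alpha(g)x=y\ \Leftrightarrow\ \beta((g^t)^{-1})x=y\ \Leftrightarrow\ \beta(g^t)^{-1}x=y\ \Leftrightarrow\ \beta(g^t)y=x\ \Leftrightarrow\ g^t\,t(\beta)(y)=t(\beta)(x),$$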

as required. ∎

In view of this, the following lemma will act as a substitute for Lemma 3.3.3:

Lemma 3.4.2.

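Let $n\in\mathbb{N}^+$, $q$ a prime power, and let $r_1,r_2\in\mathbb{N}$ with $r_1,r_2\le n$. Let

$$v_1^{(1)},\dots,v_{r_1}^{(1)},\ w_1^{(1)},\dots,w_{r_1}^{(1)},\ v_1^{(2)},\dots,v_{r_2}^{(2)},\ w_1^{(2)},\dots,w_{r_2}^{(2)}\in\mathbb{F}_q^n$$

such that $v_1^{(1)},\dots,v_{r_1}^{(1)}$ are $\mathbb{F}_q$-linearly independent and $v_1^{(2)},\dots,v_{r_2}^{(2)}$ are $\mathbb{F}_q$-linearly independent. Then the number of $g\in\mathrm{Mat}_n(q)$ such that

$$g\,v_i^{(1)}=w_i^{(1)}\quad\text{for } i=1,\dots,r_1$$

and

$$g^t\,v_j^{(2)}=w_j^{(2)}\quad\text{for } j=1,\dots,r_2$$

is at most $q^{n^2-(r_1+r_2)n+r_1r_2}$.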

Proof.

Fix $T\in\mathrm{GL}_n(q)$ such that $v_i^{(1)}=T^{-1}e_i$ for $i=1,\dots,r_1$, where $e_i$ denotes the $i$-th “standard basis vector” of $\mathbb{F}_q^n$ (which has $i$-th entry 1 and all other entries 0). Then for $i=1,\dots,r_1$, the condition $g\,v_i^{(1)}=w_i^{(1)}$ is equivalent to

$$h\,e_i=y_i^{(1)}, \tag{3.12}$$

where $h:=TgT^{-1}$ and $y_i^{(1)}:=Tw_i^{(1)}$. Furthermore, for $j=1,\dots,r_2$, the condition $g^t\,v_j^{(2)}=w_j^{(2)}$ is equivalent to

$$h^t\,x_j^{(2)}=y_j^{(2)}, \tag{3.13}$$

where $x_j^{(2)}:=(T^{-1})^tv_j^{(2)}$ and $y_j^{(2)}:=(T^{-1})^tw_j^{(2)}$. Instead of counting the number of $g\in\mathrm{Mat}_n(q)$ satisfying the $r_1+r_2$ many mapping conditions from the assumptions, we count the number of $h\in\mathrm{Mat}_n(q)$ satisfying all the equivalently rewritten conditions from formulas (3.12) and (3.13).

To this end, note that each of the conditions $h\,e_i=y_i^{(1)}$, $i=1,\dots,r_1$, completely determines one of the first $r_1$ many columns of the matrix $h$.

Note further that, since the $x_j^{(2)}=(T^{-1})^tv_j^{(2)}$, $j=1,\dots,r_2$, are $\mathbb{F}_q$-linearly independent, there exist indices $t_1,t_2,\dots,t_{r_2}$ with $1\le t_1<t_2<\cdots<t_{r_2}\le n$ such that for $j=1,\dots,r_2$, a suitable $\mathbb{F}_q$-linear combination of $x_1^{(2)},\dots,x_{r_2}^{(2)}$ is a vector $z_j$ whose $t_j$-th coordinate is 1 and whose $t_k$-th coordinate is 0 for each $k\in\{1,\dots,r_2\}\setminus\{j\}$. Hence the conditions $h^t\,x_j^{(2)}=y_j^{(2)}$, $j=1,\dots,r_2$, together imply conditions of the form

$$h^t\,z_j=u_j, \tag{3.14}$$

where $u_j$ is a suitable linear combination of $y_1^{(2)},\dots,y_{r_2}^{(2)}$. However, by the conditions from formula (3.14), the rows indexed by $t_1,\dots,t_{r_2}$ of $h$ can be expressed as $\mathbb{F}_q$-linear combinations of the rows of $h$ whose index is not from $\{t_1,\dots,t_{r_2}\}$.

Combining the two statements about how the conditions affect coefficients of $h$, we see that $h$ is completely determined by the conditions from formulas (3.12) and (3.13) if we additionally fix the coefficients of $h$ that lie neither in one of the first $r_1$ many columns nor in one of the rows indexed by $t_1,\dots,t_{r_2}$ of $h$. As there are precisely $n^2-(r_1+r_2)n+r_1r_2$ such coefficients of $h$, there are at most $q^{n^2-(r_1+r_2)n+r_1r_2}$ many $h\in\mathrm{Mat}_n(q)$ that satisfy the conditions from formulas (3.12) and (3.13), as required. ∎
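For very small parameters, the bound of Lemma 3.4.2 can be checked by exhaustive enumeration. The following is a minimal sketch, assuming a prime field $\mathbb{F}_p$ (so that arithmetic is just reduction modulo $p$); the function names `count_solutions` and `lemma_bound` are hypothetical helpers introduced here for illustration and are not part of the paper.

```python
from itertools import product

def mat_vec(g, v, p):
    """Apply the matrix g (a tuple of rows) to the column vector v over F_p."""
    return tuple(sum(a * b for a, b in zip(row, v)) % p for row in g)

def transpose(g):
    """Transpose a matrix given as a tuple of rows."""
    return tuple(zip(*g))

def count_solutions(n, p, conds1, conds2):
    """Count g in Mat_n(p) with g*v = w for all (v, w) in conds1
    and g^t*v = w for all (v, w) in conds2, by brute force."""
    count = 0
    for entries in product(range(p), repeat=n * n):
        g = tuple(entries[i * n:(i + 1) * n] for i in range(n))
        gt = transpose(g)
        if all(mat_vec(g, v, p) == w for v, w in conds1) and \
           all(mat_vec(gt, v, p) == w for v, w in conds2):
            count += 1
    return count

def lemma_bound(n, p, r1, r2):
    """The upper bound q^(n^2 - (r1 + r2) n + r1 r2) from Lemma 3.4.2."""
    return p ** (n * n - (r1 + r2) * n + r1 * r2)

# Illustrative data over F_2 with n = 3, r1 = 1, r2 = 2 (chosen to be consistent).
conds1 = [((1, 0, 0), (0, 1, 1))]
conds2 = [((0, 1, 0), (1, 0, 1)), ((1, 1, 0), (1, 1, 0))]
found = count_solutions(3, 2, conds1, conds2)
print(found, "<=", lemma_bound(3, 2, 1, 2))
```

In this illustrative instance the first column and the first two rows of the matrix are forced, which leaves exactly the $n^2-(r_1+r_2)n+r_1r_2=2$ free entries counted in the proof, so the bound is attained.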

Proof of Theorem 3.2.6 (2) for general linear groups.

Let $G=\mathrm{GL}_n(q)$, let $n\ge 72(d+1)^2l^2$, and fix automorphisms $\alpha_1,\dots,\alpha_l\in A(G)$. We want to show that the size of the fiber $\Phi$ of $1_G=\mathrm{id}$ under $w_G^{(\alpha_1,\dots,\alpha_l)}$ is at most $|G|^{d-1/(36(d+1)l^2)}$. As the argument is a modification of the one for the other isometry groups given at the end of the last subsection, we will only indicate at which points the argument needs to be altered here.

(i) Instead of $L:=\lfloor n/(9l^2)\rfloor$, we set $L:=\lfloor n/(3(d+1)l^2)\rfloor$ here.

(ii) As we said at the beginning of this subsection, we cannot write $\alpha_k$ in the form $\alpha_k=\mathrm{conj}(U_k)\,\mathrm{aut}(\sigma_k)$ anymore in general, but we can write

$$\alpha_k=\mathrm{conj}(U_k)\,\mathrm{aut}(\sigma_k)\,\tau^{a_k},$$

where $a_k\in\{0,1\}$.

(iii) Accordingly, we use Lemma 3.4.1 for the equivalent reformulation of the mapping conditions on the $g_k$, and we can show, by an analogous argument, that each of the $2d$ argument vector families belonging to one of $g_k$ or $g_k^t$, where $k=1,\dots,d$, is linearly independent.

(iv) Hence if we denote, for $k=1,\dots,d$, the number of rewritten conditions involving $g_k$ by $r_1^{(k)}$ and the number of those conditions involving $g_k^t$ by $r_2^{(k)}$, then an application of Lemma 3.4.2 yields that the number of elements of $X$, subject to the choices of $a_{i,i',j,k}$, $b_i$ and $\bar{z}$, is at most

$$q^{dn^2-(b_1+\cdots+b_L)n+r_1^{(1)}r_2^{(1)}+\cdots+r_1^{(d)}r_2^{(d)}}\le q^{dn^2-(b_1+\cdots+b_L)n+dl^2L^2}.$$

Hence we get the following upper bound on |X| here:

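$$|X|\le l^Lq^{l^2L^2+dn^2+dl^2L^2}=l^Lq^{(d+1)l^2L^2}(q^{n^2})^d.$$

Now

$$|G|=|\mathrm{GL}_n(q)|=(q^n-1)(q^n-q)\cdots(q^n-q^{n-1})\ge(q^{n-1})^n=q^{n(n-1)},$$

and so $q^{n^2}\le|G|^{n/(n-1)}=|G|^{1+1/(n-1)}\le q^{n^2/(n-1)}|G|$. Therefore,

$$|X|\le l^Lq^{(d+1)l^2L^2+dn^2/(n-1)}|G|^d.$$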

The lower bound on |X| is still the same as in formula (3.11).

(v) Note that since we are assuming that $n\ge 72(d+1)^2l^2$, it is easy to check that

$$\frac{2d+1}{n}-\frac{1}{18(d+1)l^2}\le-\frac{1}{36(d+1)l^2}. \tag{3.15}$$

Hence by combining the upper and lower bound on $|X|$, we get the following:

$$\begin{aligned}
|\Phi| &\le l^Lq^{(d+1)l^2L^2-L(n-1)+dn^2/(n-1)}|G|^d\\
&\le l^{n/(3(d+1)l^2)}q^{(d+1)l^2n^2/(9(d+1)^2l^4)-(n/(3(d+1)l^2))\cdot n/2+2nd}|G|^d\\
&\le q^n(q^{n^2})^{(d+1)l^2/(9(d+1)^2l^4)-1/(6(d+1)l^2)+2d/n}|G|^d\\
&= (q^{n^2})^{1/(9(d+1)l^2)-1/(6(d+1)l^2)+(2d+1)/n}|G|^d\\
&= (q^{n^2})^{(2d+1)/n-1/(18(d+1)l^2)}|G|^d\\
&\le (q^{n^2})^{-1/(36(d+1)l^2)}|G|^d\le|G|^{d-1/(36(d+1)l^2)},
\end{aligned}$$

where the second-to-last inequality (i.e., the first one in the last row) holds by formula (3.15). ∎
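The two elementary facts used in steps (iv) and (v), namely $q^{n(n-1)}\le|\mathrm{GL}_n(q)|\le q^{n^2}$ and inequality (3.15) under the assumption $n\ge 72(d+1)^2l^2$, can be spot-checked with exact arithmetic. This is only a sanity check on small parameter ranges, not a proof; the helper names `gl_order` and `check_3_15` are hypothetical and introduced here for illustration.

```python
from fractions import Fraction

def gl_order(n, q):
    """Order of GL_n(q), computed as the product of (q^n - q^i) for i = 0..n-1."""
    order = 1
    for i in range(n):
        order *= q**n - q**i
    return order

def check_3_15(d, l, n):
    """Exact check of (2d+1)/n - 1/(18(d+1)l^2) <= -1/(36(d+1)l^2)."""
    lhs = Fraction(2 * d + 1, n) - Fraction(1, 18 * (d + 1) * l * l)
    rhs = -Fraction(1, 36 * (d + 1) * l * l)
    return lhs <= rhs

# Inequality (3.15) at the smallest admissible n for a few small d and l.
ok_315 = all(check_3_15(d, l, 72 * (d + 1) ** 2 * l * l)
             for d in range(1, 6) for l in range(1, 6))

# The sandwich q^(n(n-1)) <= |GL_n(q)| <= q^(n^2) for a few small n and q.
ok_gl = all(q ** (n * (n - 1)) <= gl_order(n, q) <= q ** (n * n)
            for n in range(1, 6) for q in (2, 3, 4, 5))

print(ok_315, ok_gl)
```

Exact rationals are used instead of floating point so that the comparison in (3.15) is not affected by rounding.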

4 Proof of Theorem 1.2.2

For proving Theorem 1.2.2, we are supposed to exclude certain nonabelian finite simple groups as composition factors of a finite group $G$ satisfying the condition $\mathfrak{p}_w(G)\ge\rho$ for some fixed nonempty reduced word $w$ and $\rho\in(0,1]$.

Assume that $S$ is a nonabelian composition factor of $G$. By Lemma 2.1 (2), we know that $\mathfrak{p}_w(G)\le\mathfrak{p}_w(N)\,\mathfrak{p}_w(G/N)$ whenever $N$ is characteristic in $G$. It follows that $\rho\le\mathfrak{p}_w(G)\le\prod_{i=1}^r\mathfrak{p}_w(F_i)\le\min_{i=1,\dots,r}\mathfrak{p}_w(F_i)$, where $F_1,\dots,F_r$ are the characteristic composition factors of $G$, i.e., the factors in any principal characteristic series of $G$ (see [4, p. 65]), counted with multiplicities. As each $F_i$ is characteristically simple and thus of the form $S_i^{n_i}$ for some finite simple group $S_i$ and $n_i\in\mathbb{N}^+$ by [4, Result 3.3.15, p. 87], there must exist $i\in\{1,\dots,r\}$ such that $S_i=S$. Hence we can derive from the assumption that $S$ is a composition factor of $G$ that $\mathfrak{p}_w(S^n)\ge\rho$ for some $n\in\mathbb{N}^+$.

Our next goal on the way to the proof of Theorem 1.2.2 thus is to study $\mathfrak{p}_w(T)$, where $T=S^n$ is a finite nonabelian characteristically simple group. In Lemma 4.4 below, we will show that $\mathfrak{p}_w(S^n)\le\max_{w'}\mathfrak{p}_{w'}(S)$, where $w'$ runs through a finite set of words associated with $w$, the so-called “variations of $w$”:

Definition 4.1.

Let $w=x_1^{\epsilon_1}\cdots x_l^{\epsilon_l}=X_{\iota(1)}^{\epsilon_1}\cdots X_{\iota(l)}^{\epsilon_l}$ be a reduced word of length $l$ in the variables $X_1,\dots,X_d$. For $k=1,\dots,d$, denote by $a_k$ the number of occurrences of $X_k^{\pm1}$ in $w$ (so that $a_1+\cdots+a_d=l$). A variation of $w$ is a word of the form $X_{\iota(1),t_1}^{\epsilon_1}\cdots X_{\iota(l),t_l}^{\epsilon_l}$, where $t_i\in\{1,\dots,a_{\iota(i)}\}$ for $i=1,\dots,l$.

Hence a variation of $w$ is a word $w'$ of the same length as $w$ and in variables of the form $X_{k,t}$ with $k\in\{1,\dots,d\}$ and $t\in\{1,\dots,a_k\}$ that is obtained from $w$ by adding second indices to each occurrence of $X_k^{\pm1}$, $k=1,\dots,d$, in $w$ such that each second index is from the “admissible range”, i.e., lies between 1 and the number $a_k$ of occurrences of $X_k^{\pm1}$ in $w$.

Example 4.2.

Consider the commutator word $w=[X_1,X_2]=X_1X_2X_1^{-1}X_2^{-1}$. Then $X_{1,2}X_{2,1}X_{1,1}^{-1}X_{2,1}^{-1}$ is a variation of $w$. The word $X_{1,3}X_{2,1}X_{1,2}^{-1}X_{2,2}^{-1}$, however, is not a variation of $w$, since the second index 3 added to the first variable $X_1$ does not lie within the admissible range $\{1,2\}$.

Remark 4.3.

Some simple observations concerning variations:

  1. Each variation of a reduced word of length $l$ is again a reduced word of length $l$.

  2. Each reduced word $w$ only has finitely many variations. More precisely, if $w$ is a reduced word in the variables $X_1,\dots,X_d$, and $X_i^{\pm1}$ occurs precisely $a_i$ times in $w$ for $i=1,\dots,d$, then the number of variations of $w$ is precisely $\prod_{k=1}^da_k^{a_k}$.

  3. Each reduced word $w$ can be obtained from each of its variations $w'$ by substituting $X_k$ for $X_{k,t}$, $t=1,\dots,a_k$, in $w'$. Hence for each finite group $G$ and each variation $w'$ of $w$, $\pi_{w'}(G)=1$ implies $\pi_w(G)=1$, and $\mathfrak{p}_{w'}(G)=1$ implies $\mathfrak{p}_w(G)=1$.
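Definition 4.1 and the count in Remark 4.3 (2) are easy to make concrete. The sketch below encodes a word as a list of pairs `(k, eps)` standing for $X_k^{\epsilon}$ and a variation as a list of triples `(k, t, eps)` standing for $X_{k,t}^{\epsilon}$; this encoding and the function name `variations` are assumptions made here for illustration only.

```python
from collections import Counter
from itertools import product
from math import prod

def variations(word):
    """Return all variations of a word given as a list of (k, eps) pairs.

    Each occurrence of X_k^{+-1} receives a second index t from the
    admissible range 1..a_k, where a_k counts the occurrences of X_k^{+-1}.
    """
    counts = Counter(k for k, _ in word)
    ranges = [range(1, counts[k] + 1) for k, _ in word]
    return [[(k, t, eps) for (k, eps), t in zip(word, ts)]
            for ts in product(*ranges)]

# The commutator word [X_1, X_2] = X_1 X_2 X_1^-1 X_2^-1 from Example 4.2.
commutator = [(1, 1), (2, 1), (1, -1), (2, -1)]
all_vars = variations(commutator)

# Remark 4.3 (2): the number of variations is the product of a_k^(a_k).
counts = Counter(k for k, _ in commutator)
expected = prod(a ** a for a in counts.values())
print(len(all_vars), expected)  # prints "16 16"
```

With this encoding, the first word from Example 4.2 appears in `all_vars`, while the second one does not, because its second index 3 is outside the admissible range.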

Lemma 4.4.

Let $w$ be a reduced word of length $l\ge1$ in the variables $X_1,\dots,X_d$, $S$ a nonabelian finite simple group and $n\in\mathbb{N}^+$. Set

$$\epsilon=\epsilon(S,w):=\max_{w'}\mathfrak{p}_{w'}(S)\in(0,1],$$

where $w'$ runs through the variations of $w$. Then

$$\mathfrak{p}_w(S^n)\le\epsilon^{\lceil n/l^2\rceil}\le\epsilon.$$

We note that for the proof of our main results, we will only require the weaker inequality $\mathfrak{p}_w(S^n)\le\epsilon$ from Lemma 4.4, but the stronger one, $\mathfrak{p}_w(S^n)\le\epsilon^{\lceil n/l^2\rceil}$, will be used in Section 5 in the proof that Conjecture 5.2 implies Conjecture 5.3.

Proof of Lemma 4.4.

Fix automorphisms $\alpha_1,\dots,\alpha_l$ of $S^n$ and an element $g=(g_1,\dots,g_n)$ of $S^n$. By [4, Section 3.3.20, p. 90], we know that

$$\mathrm{Aut}(S^n)=\mathrm{Aut}(S)\wr\mathcal{S}_n,$$

and so for $i=1,\dots,l$, we can write $\alpha_i=(\alpha_{i,1}\times\cdots\times\alpha_{i,n})\sigma_i$, where each $\alpha_{i,j}$ is an automorphism of $S$ and $\sigma_i$ is a coordinate permutation on $S^n$.

Let $s_1=(s_{1,1},\dots,s_{1,n}),\dots,s_d=(s_{d,1},\dots,s_{d,n})$, where each $s_{k,j}$ is a variable ranging over $S$, so that each $s_k$ can be viewed as a variable element of $S^n$. We want to bound the number of solutions in $(S^n)^d\cong S^{nd}$ of the equation

$$w_{S^n}^{(\alpha_1,\dots,\alpha_l)}(s_1,\dots,s_d)=g. \tag{4.1}$$

As usual, let us write $w=x_1^{\epsilon_1}\cdots x_l^{\epsilon_l}=X_{\iota(1)}^{\epsilon_1}\cdots X_{\iota(l)}^{\epsilon_l}$. By computing the left-hand side in formula (4.1) and comparing the entries of the vectors on both sides of the resulting equation, we see that the equation in formula (4.1) is equivalent to the conjunction of the following $n$ “coordinate equations”, for $i=1,\dots,n$:

$$\alpha_{1,i}(s_{\iota(1),\sigma_1^{-1}(i)})^{\epsilon_1}\cdots\alpha_{l,i}(s_{\iota(l),\sigma_l^{-1}(i)})^{\epsilon_l}=g_i.$$

The left-hand side of each of these equations is, up to a suitable renaming of the variables, the evaluation of an automorphic word map associated with a variation of $w$ in variables ranging over $S$. In particular, if $J_i$ denotes the set of those variables $s_{k,j}$ that are mentioned in the $i$-th coordinate equation, then that same equation implies that if we project the solution set $\Phi$ of the equation (4.1) onto those coordinates that correspond to variables from $J_i$, the resulting image has size at most $\epsilon|S|^{|J_i|}$.

Our goal is to find $\lceil n/l^2\rceil$ pairwise distinct indices $i_1,\dots,i_{\lceil n/l^2\rceil}\in\{1,\dots,n\}$ such that the associated coordinate equations are pairwise independent, i.e., such that $J_{i_t}\cap J_{i_u}=\emptyset$ for $t\ne u$. Once we have found these indices, we are done, since it then follows that the projection of $\Phi$ onto those coordinates that correspond to variables from $\bigcup_{t=1}^{\lceil n/l^2\rceil}J_{i_t}$ has size at most

$$\prod_{t=1}^{\lceil n/l^2\rceil}\epsilon|S|^{|J_{i_t}|}=\epsilon^{\lceil n/l^2\rceil}|S|^{|J_{i_1}|+\cdots+|J_{i_{\lceil n/l^2\rceil}}|},$$

and thus $\Phi$ itself, being a subset of $(S^n)^d\cong S^{nd}$, has size at most

$$\epsilon^{\lceil n/l^2\rceil}|S|^{|J_{i_1}|+\cdots+|J_{i_{\lceil n/l^2\rceil}}|}\cdot|S|^{nd-(|J_{i_1}|+\cdots+|J_{i_{\lceil n/l^2\rceil}}|)}=\epsilon^{\lceil n/l^2\rceil}|S|^{nd},$$

as required.

We choose the indices $i_1,\dots,i_{\lceil n/l^2\rceil}$ iteratively: $i_1$ can be chosen arbitrarily from $\{1,\dots,n\}$. If we have already found indices $i_1,\dots,i_t$ such that the associated coordinate equations are pairwise independent and we want to find another index $i_{t+1}$, it is sufficient to choose $i_{t+1}$ outside of the set $\bigcup_{i=1}^l\sigma_i\bigl[\bigcup_{u=1}^tM_u\bigr]$, where $M_u$ denotes the set of second indices occurring in the $i_u$-th equation. This set of “forbidden” values for $i_{t+1}$ has size at most $tl^2$, and so as long as $n>tl^2$, i.e., $\lceil n/l^2\rceil\ge t+1$, we can choose $i_{t+1}$ as desired. This concludes the proof. ∎
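The iterative choice of indices at the end of the proof is a greedy selection, which can be sketched as follows. Here the coordinate permutations $\sigma_1,\dots,\sigma_l$ are modelled as Python lists on the index set `0..n-1` (with `sigma[j]` the image of `j`), and the $i$-th coordinate equation is assumed to involve exactly the second indices $\sigma_k^{-1}(i)$; the function names are hypothetical and introduced only for this illustration.

```python
import math
import random

def second_indices(sigmas, n):
    """For each i, return M_i = {sigma_k^{-1}(i) : k = 1..l}, the set of
    second indices occurring in the i-th coordinate equation."""
    inverses = []
    for sigma in sigmas:
        inv = [0] * n
        for j, image in enumerate(sigma):
            inv[image] = j
        inverses.append(inv)
    return [{inv[i] for inv in inverses} for i in range(n)]

def independent_equations(sigmas, n):
    """Greedily pick indices whose sets M_i are pairwise disjoint."""
    index_sets = second_indices(sigmas, n)
    chosen, used = [], set()
    for i in range(n):
        if used.isdisjoint(index_sets[i]):
            chosen.append(i)
            used |= index_sets[i]
    return chosen

# Illustrative run with l = 3 random permutations of n = 50 coordinates.
random.seed(0)
n, l = 50, 3
sigmas = [random.sample(range(n), n) for _ in range(l)]
chosen = independent_equations(sigmas, n)
print(len(chosen) >= math.ceil(n / l**2))  # prints "True"
```

The scan stops only when every index is forbidden, and by the counting argument in the proof at most $tl^2$ indices can be forbidden after $t$ choices, so the greedy scan always returns at least $\lceil n/l^2\rceil$ indices.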

The proof of Theorem 1.2.2 is now easy:

Proof of Theorem 1.2.2.

(1) If $S=\mathcal{A}_m$ is a composition factor of $G$, then by the observations from the beginning of this section, it follows that $\mathfrak{p}_w(\mathcal{A}_m^n)\ge\rho$ for some $n\in\mathbb{N}^+$, and thus $\mathfrak{p}_{w'}(\mathcal{A}_m)\ge\rho$ for some variation $w'$ of $w$. However, $w'$ is a reduced word of length $l$ in at most $l$ distinct variables, and so if

|S|=|𝒜m|>max{256l16e16Ml-2!,ρ-16M},

we get a contradiction, since this implies by Theorem 3.1.2 (1) that

$$\rho\le\mathfrak{p}_{w'}(\mathcal{A}_m)\le|\mathcal{A}_m|^{-1/(16M)}<(\rho^{-16M})^{-1/(16M)}=\rho.$$

(2) Assume that $S=X_r(q)$ is a (classical) simple group of Lie type with $r>\max\{72(l+1)^2l^2,\,72(l+1)l^2\log_2(\rho^{-1})\}$ and that $S$ is a composition factor of $G$. As before, it follows that $\mathfrak{p}_{w'}(S)\ge\rho$ for some variation $w'$ of $w$. In view of our choice of $r$, and using again the fact that $w'$ is a reduced word of length $l$ in at most $l$ distinct variables and that $|X_r(q)|\ge q^{r^2}\ge 2^{r^2}$ (which follows from the known formulas for $|X_r(q)|$, for example from [1, Table 6, p. xvi]), we get by Theorem 3.1.2 (2) that

$$\rho\le\mathfrak{p}_{w'}(X_r(q))\le|X_r(q)|^{-1/(72(l+1)l^2)}\le 2^{-r^2/(72(l+1)l^2)}<\rho,$$

a contradiction. ∎

Proof of Theorem 1.1.2.

This follows immediately from Theorem 1.2.2, as we have $\Pi_w(G)\le\mathfrak{P}_w(G)$. ∎

5 Concluding remarks

As mentioned at the beginning of Section 3, the generalization of the third case in Larsen and Shalev’s proof (the simple Lie-type groups of bounded rank) from the word map setting to automorphic word maps is open. Described very briefly, Larsen and Shalev’s approach to the third case is an algebro-geometric one and consists in studying the fibers of word maps in simple Lie-type groups as subvarieties of the Lie-type groups viewed as linear algebraic groups. One of the problems with extending this approach to automorphic word maps is that because of the existence of field automorphisms on Lie-type groups, the degrees of the polynomial equations defining the fiber as a variety are, in contrast to the word map setting, in general not bounded by a constant any more.

Still, hoping that this and other difficulties can be overcome, we will spend the rest of this concluding section discussing possible consequences of a successful adaptation of the proof.

The following is a direct generalization of [3, Theorem 1.1] to automorphic word maps and would most likely result from a suitable adaptation of Larsen and Shalev’s proof in its entirety:

Conjecture 5.1.

For each nonempty and reduced word $w$ in $d$ distinct variables, there exist constants $N(w),\eta(w)>0$ such that for all nonabelian finite simple groups $S$ with $|S|\ge N(w)$, the inequality $\mathfrak{P}_w(S)\le|S|^{d-\eta(w)}$ holds.

Consider also the following slightly stronger version of Conjecture 5.1:

Conjecture 5.2.

Like Conjecture 5.1, but with the additional assumption that the constants N(w) and η(w) are effective, i.e., they can be computed algorithmically from the word w as input.

Our last goal in this paper is to show that Conjecture 5.2 implies another interesting statement, given as Conjecture 5.3 below. Before this, for the reader’s convenience, we briefly review some basic facts on the solvable radical and finite groups with trivial solvable radical (for more details, readers are referred to [4, pp. 88 ff. and p. 122]), and we give some motivation.

Recall that every finite group $G$ has a largest solvable normal subgroup, called the solvable radical of $G$ and denoted by $\mathrm{Rad}(G)$. The quotient $G/\mathrm{Rad}(G)$ is semisimple, i.e., it has no nontrivial solvable normal subgroups at all. It can be shown that the socle $\mathrm{Soc}(H)$ (the subgroup generated by all the minimal nontrivial normal subgroups) of a finite semisimple group $H$ is isomorphic with a centerless CR-group, i.e., a direct product of nonabelian finite simple groups, and that $H$ acts faithfully on $\mathrm{Soc}(H)$ via conjugation, so that $H$ is isomorphic with a subgroup of $\mathrm{Aut}(\mathrm{Soc}(H))$ containing $\mathrm{Inn}(\mathrm{Soc}(H))\cong\mathrm{Soc}(H)$. Conversely, if $R$ is a finite centerless CR-group, and $\mathrm{Inn}(R)\le G\le\mathrm{Aut}(R)$, then $G$ is semisimple and $\mathrm{Soc}(G)=\mathrm{Inn}(R)\cong R$. Hence the finite semisimple groups are, up to isomorphism, just those finite groups that occur in between the inner and the full automorphism group of a finite centerless CR-group.

The index [G:Rad(G)] is clearly an upper bound on the product of the orders of all the nonabelian composition factors of G (counted with multiplicities), so that deriving an upper bound on it means establishing some heavy restrictions on the structure of G.

It would be nice if we had an algorithmic method to decide in general for a given reduced word $w$ whether a condition of the form $\mathfrak{p}_w(G)\ge\rho$ is always strong enough to imply that $[G:\mathrm{Rad}(G)]$ is bounded in terms of $w$ and $\rho$ or not. This is the case if Conjecture 5.2 holds true.

Conjecture 5.3.

There exists an algorithm which, on input a reduced word w, achieves the following:

  1. It decides whether there exists a function $g_w:(0,1]\to[1,\infty)$ such that for all finite groups $G$ and all $\rho\in(0,1]$, if $\mathfrak{p}_w(G)\ge\rho$, then $[G:\mathrm{Rad}(G)]\le g_w(\rho)$.

  2. In case such a function $g_w$ exists, it also outputs a definition for a possible choice of $g_w$.

Proof that Conjecture 5.2 implies Conjecture 5.3.

Write

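$$\mathrm{Soc}(G/\mathrm{Rad}(G))=S_1^{n_1}\times\cdots\times S_r^{n_r},$$

where the $S_i$ are pairwise nonisomorphic nonabelian finite simple groups. Note that each $S_i^{n_i}$ is a characteristic composition factor of $G$, and so

$$\rho\le\mathfrak{p}_w(G)\le\mathfrak{p}_w(S_i^{n_i})\le\max_{w'}\mathfrak{p}_{w'}(S_i)$$

for $i=1,\dots,r$, where $w'$ runs through the variations of $w$.

Compute $N_0(w):=\max_{w'}N(w')$ and $\eta_0(w):=\min_{w'}\eta(w')$, and note that necessarily

$$\max_{i=1,\dots,r}|S_i|\le\max\{N_0(w),\rho^{-1/\eta_0(w)}\},$$

as otherwise, if $|S_i|$ is strictly larger than that maximum, it follows that

$$\rho\le\max_{w'}\mathfrak{p}_{w'}(S_i)\le|S_i|^{-\eta_0(w)}<(\rho^{-1/\eta_0(w)})^{-\eta_0(w)}=\rho,$$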

a contradiction.

Hence we can effectively reduce the list of nonabelian finite simple groups S that could potentially occur as a factor of Soc(G/Rad(G)) to a finite number of possibilities. There are two cases to consider:

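  1. For one of those finitely many nonabelian finite simple groups $S$, we have $\mathfrak{p}_w(S)=1$. In other words, there exist automorphisms $\alpha_1,\dots,\alpha_l$ of $S$ such that $w_S^{(\alpha_1,\dots,\alpha_l)}$ is constant on $S^d$. Then it is easy to see that

    $$w_{S^n}^{(\alpha_1^{(n)},\dots,\alpha_l^{(n)})}\equiv1\quad\text{on } (S^n)^d,$$

    and so $\mathfrak{p}_w(S^n)=1$ for all $n\in\mathbb{N}^+$. Hence in that case, $[G:\mathrm{Rad}(G)]$ cannot be bounded under any of the assumptions $\mathfrak{p}_w(G)\ge\rho$, $\rho\in(0,1]$.

  2. For each of these finitely many $S$, $\mathfrak{p}_w(S)<1$. Then for every variation $w'$ of $w$, $\mathfrak{p}_{w'}(S)<1$ as well, by Remark 4.3 (3). Hence

    $$\epsilon=\epsilon(S,w):=\max_{w'}\mathfrak{p}_{w'}(S)\le1-\frac{1}{|S|^l}.$$

    Therefore, by Lemma 4.4, $\mathfrak{p}_w(S^n)\ge\rho$ implies

    $$n\le n_0(w,\rho):=l^2\,\frac{\log(\rho)}{\log(1-1/|S|^l)}.$$

    It follows that $|\mathrm{Soc}(G/\mathrm{Rad}(G))|$ is effectively bounded from above in terms of $w$ and $\rho$, namely by $\prod_S|S|^{n_0(w,\rho)}$, where $S$ runs through the nonabelian finite simple groups of order at most $\max\{N_0(w),\rho^{-1/\eta_0(w)}\}$. Since $G/\mathrm{Rad}(G)$ embeds into $\mathrm{Aut}(\mathrm{Soc}(G/\mathrm{Rad}(G)))$, its order is thus also effectively bounded in terms of $w$ and $\rho$; more precisely,

    $$|G/\mathrm{Rad}(G)|\le\Bigl|\mathrm{Aut}\Bigl(\prod_SS^{n_0(w,\rho)}\Bigr)\Bigr|=\Bigl|\prod_S\mathrm{Aut}(S)\wr\mathcal{S}_{n_0(w,\rho)}\Bigr|=\prod_S\bigl(|\mathrm{Aut}(S)|^{n_0(w,\rho)}\cdot n_0(w,\rho)!\bigr).$$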

We can thus conclude the proof by noting that it can be effectively decided which of the two cases occurs (just go through the effective finite list of groups S and check for each of them, if necessary by brute force, whether 𝔭w(S)=1). ∎
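To get a feeling for the size of the bound $n_0(w,\rho)$ from the second case, it can be evaluated numerically. The sketch below takes the order $|S|$, the word length $l$ and $\rho$ as inputs; the function name `n0_bound` is hypothetical, and `math.log1p` is used only to evaluate $\log(1-1/|S|^l)$ accurately when $1/|S|^l$ is tiny.

```python
import math

def n0_bound(order_S, l, rho):
    """Evaluate l^2 * log(rho) / log(1 - 1/|S|^l) for 0 < rho <= 1."""
    if not 0 < rho <= 1:
        raise ValueError("rho must lie in (0, 1]")
    log_eps = math.log1p(-1.0 / order_S**l)  # log(1 - 1/|S|^l), negative
    return l**2 * math.log(rho) / log_eps

# Illustrative values: |S| = 60 (the order of the alternating group A_5)
# and a word of length l = 4, such as the commutator word.
for rho in (0.5, 0.25, 0.1):
    print(rho, n0_bound(60, 4, rho))
```

Since $\log(1-x)\approx-x$ for small $x$, the bound grows roughly like $l^2|S|^l\log(1/\rho)$, which illustrates that the argument yields an effective but very large bound.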


Communicated by Timothy C. Burness


Funding source: Austrian Science Fund

Award Identifier / Grant number: Project F5504-N26

Funding statement: The author is supported by the Austrian Science Fund (FWF): Project F5504-N26, which is a part of the Special Research Program “Quasi-Monte Carlo Methods: Theory and Applications”.

References

[1] J. H. Conway, R. T. Curtis, S. P. Norton, R. A. Parker and R. A. Wilson, Atlas of Finite Groups, corr. reprint, Oxford University Press, Oxford, 2003.

[2] P. Kleidman and M. Liebeck, The Subgroup Structure of the Finite Classical Groups, London Math. Soc. Lecture Note Ser. 129, Cambridge University Press, Cambridge, 1990. doi:10.1017/CBO9780511629235

[3] M. Larsen and A. Shalev, Fibers of word maps and some applications, J. Algebra 354 (2012), 36–48. doi:10.1016/j.jalgebra.2011.10.040

[4] D. J. S. Robinson, A Course in the Theory of Groups, 2nd ed., Grad. Texts in Math. 80, Springer, New York, 1996. doi:10.1007/978-1-4419-8594-1

[5] A. Shalev, Some results and problems in the theory of word maps, Erdős Centennial, Bolyai Soc. Math. Stud. 25, János Bolyai Mathematical Society, Budapest (2013), 611–649. doi:10.1007/978-3-642-39286-3_22

[6] T. Tao, 254A, Notes 0a: Stirling’s formula, online notes (2010), https://terrytao.wordpress.com/2010/01/02/254a-notes-0a-stirlings-formula/, accessed October 14, 2016.

Received: 2016-10-18
Revised: 2017-6-2
Published Online: 2017-7-20
Published in Print: 2017-11-1

© 2017 Walter de Gruyter GmbH, Berlin/Boston
