Equations involving the modular 𝑗-function and its derivatives

  • Vahagn Aslanyan, Sebastian Eterović and Vincenzo Mantova
Published/Copyright: October 11, 2025

Abstract

We show that, for any polynomial $F(X, Y_0, Y_1, Y_2) \in \mathbb{C}[X, Y_0, Y_1, Y_2]$, the equation $F(z, j(z), j'(z), j''(z)) = 0$ has a Zariski dense set of solutions in the hypersurface $F(X, Y_0, Y_1, Y_2) = 0$, unless $F$ is in $\mathbb{C}[X]$ or it is divisible by $Y_0$, $Y_0 - 1728$, or $Y_1$. Our methods establish criteria for finding solutions to more general equations involving periodic functions. Furthermore, they produce a qualitative description of the distribution of these solutions.

1 Introduction

The problem of determining which (systems of) equations involving certain classical transcendental functions of a complex variable have solutions is a natural question at the intersection between complex geometry, model theory, and number theory. In complex geometry, it is a form of analytic Nullstellensatz for the given functions; in model theory, it plays an important role in the definability properties of the functions involved; and in number theory, it is related to Schanuel’s conjecture and its analogues (given by special cases of the Grothendieck–André generalised period conjecture). Often, the function under consideration is of arithmetic importance. Examples of such classical functions are the exponential functions of semi-abelian varieties and Fuchsian automorphic functions. In this paper, we focus on the modular 𝑗-function and its derivatives.

The first conjecture in this area arose from Zilber’s work on the model theory of complex exponentiation [17, 16, 18]. It is now referred to as the Exponential (Algebraic) Closedness conjecture or Zilber’s Nullstellensatz, and predicts when systems of equations involving addition, multiplication, and complex exponentiation have solutions in the complex numbers. We refer to the general version of the problem as Existential Closedness, or EC for short. An EC conjecture for the $j$-function was proposed in [4, §1]; in geometric terms, it states that any algebraic variety $V \subseteq \mathbb{C}^{2n}$ satisfying geometric conditions known as freeness and broadness intersects the $n$-fold graph of the $j$-function. The definitions of these geometric conditions are long and will not be used in the present work, so we refer the interested reader to [4, §2.2]. Informally, freeness and broadness ensure that the equations defining $V$ do not break any functional properties of $j$ coming from the linear-fractional action of $\mathrm{GL}_2^+(\mathbb{Q})$ (where $+$ denotes positive determinant) on the upper half-plane, and do not contradict a conjecture on transcendental values of the $j$-function analogous to Schanuel’s conjecture for exponentiation (see [4, Conjecture 1.1] and [3, §6.3] for the statement of this conjecture).

If one somehow knows that an algebraic variety $V$ does intersect the graph of $j$, a very natural next question is how these intersection points are distributed within $V$. For instance, one may ask whether these points are Zariski dense in $V$. We remark that if $V \subseteq \mathbb{C}^{2n}$ satisfies the above-mentioned geometric conditions of freeness and broadness, then for any Zariski open subset $V' \subseteq V$, it is possible to construct an algebraic variety $W \subseteq \mathbb{C}^{2(n+1)}$ which is also free and broad and projects onto $V'$. Thus, if we assume EC then, applying it to $W$, we deduce that $V'$ intersects the graph of $j$. Since $V'$ was an arbitrary Zariski open subset of $V$, we conclude that the intersection of $V$ with the graph of $j$ is Zariski dense in $V$.

In the same work [4], the authors also proposed an extension of the conjecture incorporating the derivatives of $j$ (see [4, Conjecture 1.6]). This version of EC is often referred to as Existential Closedness with Derivatives, or ECD for short. This time the variety $V$ in question is a subset of $\mathbb{C}^{4n}$, and the conjecture states that if $V$ satisfies analogous geometric notions of freeness and broadness (again related to a form of Schanuel’s conjecture, now involving $j$ and its derivatives), then $V$ intersects the $n$-fold graph of the map $z \mapsto (j(z), j'(z), j''(z))$. For the definitions and precise statements of these conjectures, see [4, 2, 1]. Note that we do not consider the third and higher derivatives of $j$ as these are rational over $j, j', j''$ (see (2.2)). As with EC, ECD implies that if $V \subseteq \mathbb{C}^{4n}$ satisfies the geometric conditions of freeness and broadness, then $V$ has a Zariski dense set of points of the desired form.

Very few cases of ECD have been proven, in comparison to EC, where various families of varieties in $\mathbb{C}^{2n}$ have been shown to satisfy the conjecture. Prior to the present work, only very special cases of ECD had been solved, proving solvability of some simple equations involving just $j'$ (so not combining it with $j$ or $j''$); see [9, 10]. An ECD statement for “blurrings” (certain multi-valued twists) of $j$ was obtained in [4]. All of these papers mostly focused on, and established stronger results for, the EC conjecture for the $j$-function (without derivatives). These results are analogous to their exponential counterparts, namely, [6, 7, 5, 12, 16, 11]. Although different methods have been used across these works, one common feature is that they all exploit in some way the periodicity of $\exp$ or the $\mathrm{SL}_2(\mathbb{Z})$-invariance of $j$. Incorporating the derivatives of $j$ into the equations then presents a significant new challenge, as $j'$ and $j''$ are no longer $\mathrm{SL}_2(\mathbb{Z})$-invariant. It is also worth noting that a differential version of ECD was obtained in [2]; it is so far the only setting where a full Existential Closedness statement has been proved for $j$ with derivatives, and the same method also applies to $\exp$.

In this article, we prove the ECD conjecture when $n = 1$, which precisely states that any algebraic variety $V \subseteq \mathbb{C}^4$ of dimension 3 without constant coordinates contains (a Zariski dense set of) points of the form $(z, j(z), j'(z), j''(z))$. This amounts to checking exactly which equations of one complex variable involving only $z, j(z), j'(z), j''(z)$ have solutions, and whether these solutions are Zariski dense. Note that, for $n = 1$, broadness of $V$ just means $\dim V \geq 3$; hence the only non-trivial case is $\dim V = 3$. On the other hand, freeness for $n = 1$ means $V$ has no constant coordinates. Nevertheless, we will even be able to decide what happens when $V$ does have a constant coordinate.

Our first main result establishes the existence of solutions in all non-trivial cases.

Theorem 1.1

Let $F(X, Y_0, Y_1, Y_2) \in \mathbb{C}[X, Y_0, Y_1, Y_2] \setminus \mathbb{C}[X]$. Then the equation

$$F(z, j(z), j'(z), j''(z)) = 0$$

has infinitely many solutions.

The proof of Theorem 1.1 is based on a generalisation of the methods of [9], which use Rouché’s theorem from classical complex analysis to establish some cases of EC for $j$ (without derivatives). Theorem 1.1 can be seen as an analogue of the classical fact that every irreducible polynomial $p(X, Y) \in \mathbb{C}[X, Y]$ which depends on $Y$ has infinitely many zeroes of the form $(z, \exp(z))$, unless $p = cY$ for some $c \in \mathbb{C}^\times$.

Throughout the paper, all algebraic subvarieties of $\mathbb{C}^4$ will be defined by polynomials in the ring $\mathbb{C}[X, Y_0, Y_1, Y_2]$.

Our main goal is to obtain a much stronger version of Theorem 1.1. We show that, for any polynomial $F(X, Y_0, Y_1, Y_2)$, the set

$$\{ (z, j(z), j'(z), j''(z)) \in \mathbb{H} \times \mathbb{C}^3 : F(z, j(z), j'(z), j''(z)) = 0 \}$$

is Zariski dense in the hypersurface $F(X, Y_0, Y_1, Y_2) = 0$, unless $F$ is divisible by one of an explicit (finite) list of polynomials. When this set is Zariski dense, we say that the equation $F(z, j(z), j'(z), j''(z)) = 0$ has a Zariski dense set of solutions (see Definition 2.1); that is, by a solution of such an equation we understand a tuple $(z_0, j(z_0), j'(z_0), j''(z_0))$ rather than just $z_0$.

We remind the reader that this is equivalent to establishing certain cases of ECD for subvarieties of $\mathbb{C}^8$: given a hypersurface $V \subseteq \mathbb{C}^4$ and a Zariski open dense $V' \subseteq V$, there is $W \subseteq \mathbb{C}^8$ free and broad which projects onto $V'$ such that $W$ intersects the graph of $j$ and its derivatives if and only if $V'$ does. For instance, the system $\{ j''(z) = 0,\ j'(z) \neq 0 \}$ has a solution if and only if $\{ j''(z_1) = 0,\ j'(z_1) z_2 = 1 \}$ does.

The bulk of the paper is focused on proving the Zariski density of the set of solutions, which the proof of Theorem 1.1 does not provide. For instance, the solutions of the equation

z j ′′ ( z ) + ( z 3 + 1 ) j ( z ) 2 + j ( z ) j ( z ) 7 = 0

found via the proof of Theorem 1.1 are the $\mathrm{SL}_2(\mathbb{Z})$-conjugates of $\rho = -\frac{1}{2} + \frac{\sqrt{3}}{2} i$. These are obviously not Zariski dense, for it is well known that $j(\gamma\rho) = j'(\gamma\rho) = j''(\gamma\rho) = 0$ for every $\gamma \in \mathrm{SL}_2(\mathbb{Z})$ (see [13, p. 40]). Indeed, Zariski density requires at least that the solutions are not contained in finitely many $\mathrm{SL}_2(\mathbb{Z})$-orbits, except when the equation is of the form $\prod_k (j(z) - u_k) = 0$ for some $u_k \in \mathbb{C}$.

The zeroes of $j'$ are in fact problematic: observe that, for all $z \in \mathbb{H}$,

$$j(z)\,(j(z) - 1728) = 0 \iff j'(z) = 0.$$

This immediately gives that the three equations $j(z) = 0$, $j(z) - 1728 = 0$, and $j'(z) = 0$ do not have Zariski dense sets of solutions. Our main result shows that these are essentially the only non-examples.
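The equivalence above can be read off from the ramification of $j$. The following is a sketch under the standard assumption (not spelled out in the text) that $j - j(\tau)$ vanishes at $\tau$ to order $e_\tau$, the order of the stabiliser of $\tau$ in $\mathrm{PSL}_2(\mathbb{Z})$; the symbol $e_\tau$ is introduced here only for illustration.

```latex
% Assumption: e_\tau is the order of the stabiliser of \tau in PSL_2(Z)
% (hypothetical notation, used only in this sketch).
\begin{align*}
\operatorname{ord}_{\tau}\bigl(j - j(\tau)\bigr) &= e_\tau,
\qquad
e_\tau =
\begin{cases}
3 & \text{if } \tau \in \mathrm{SL}_2(\mathbb{Z})\,\rho,\\
2 & \text{if } \tau \in \mathrm{SL}_2(\mathbb{Z})\,i,\\
1 & \text{otherwise},
\end{cases}\\
\operatorname{ord}_{\tau}(j') &= e_\tau - 1,\\
j'(\tau) = 0 &\iff e_\tau > 1
\iff j(\tau) \in \{0, 1728\}
\iff j(\tau)\bigl(j(\tau) - 1728\bigr) = 0.
\end{align*}
```

In particular, under this assumption $j'$ vanishes to order 2 on the orbit of $\rho$, to order 1 on the orbit of $i$, and nowhere else in $\mathbb{H}$.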

Theorem 1.2

For any polynomial $F(X, Y_0, Y_1, Y_2) \in \mathbb{C}[X, Y_0, Y_1, Y_2] \setminus \mathbb{C}[X]$ which is coprime to $Y_0 (Y_0 - 1728) Y_1$, the equation $F(z, j(z), j'(z), j''(z)) = 0$ has a Zariski dense set of solutions, i.e. the set

$$\{ (z, j(z), j'(z), j''(z)) \in \mathbb{H} \times \mathbb{C}^3 : F(z, j(z), j'(z), j''(z)) = 0 \}$$

is Zariski dense in the hypersurface $F(X, Y_0, Y_1, Y_2) = 0$.

Remark 1.3

For every rational function $G(X, Y_0, Y_1, Y_2) \in \mathbb{C}(X, Y_0, Y_1, Y_2)$, Theorem 1.2 implies that the function $G(z, j(z), j'(z), j''(z))$ has a zero unless $G$ is of the form

$$\frac{Y_0^{s} (Y_0 - 1728)^{t} Y_1^{\ell}}{H(X, Y_0, Y_1, Y_2)},$$

where $H$ is a polynomial and $s, t, \ell \in \mathbb{N}$.

A special case of Theorem 1.2 is that the equation $j''(z) = 0$ has a Zariski dense set of solutions.[1] In this case, even proving that there is a solution outside the $\mathrm{SL}_2(\mathbb{Z})$-orbit of $\rho$ is highly non-trivial; see Section 7.1.

Remark 1.4

In [8], the author studies the problem of finding generic solutions to equations involving $j$ (and its derivatives) under the assumption that the system has a Zariski dense set of solutions. In particular, combining Theorem 1.2 with [8, Theorem 6.5], we get that there is a countable field $C_j \subseteq \mathbb{C}$ such that, for any irreducible hypersurface $V \subseteq \mathbb{C}^4$ satisfying the conditions of Theorem 1.2, if $V$ is not definable over $C_j$, then for any finitely generated subfield $K \subseteq \mathbb{C}$ over which $V$ can be defined, there is a point of the form $(z, j(z), j'(z), j''(z)) \in V$ such that $\operatorname{tr.deg.}_K K(z, j(z), j'(z), j''(z)) = \dim V = 3$.

To prove Theorem 1.2, we establish general criteria for the solvability of certain equations involving periodic functions (see Section 4). The following proposition is a special case of those criteria.

Definition 1.5

A meromorphic function $f\colon \mathbb{H} \to \mathbb{C}$ is 1-periodic if $f(z + 1) = f(z)$ for every $z \in \mathbb{H}$. Every such function induces a meromorphic function $\tilde{f}(q)$ on the punctured unit disc via the change of variable $q = \exp(2\pi i z)$. We say that $f$ is meromorphic at $i\infty$ if $\tilde{f}$ is meromorphic at 0.

Proposition 1.6

Let $f_0, \ldots, f_n \colon \mathbb{H} \to \mathbb{C}$ be 1-periodic functions which are meromorphic on $\mathbb{H} \cup \{i\infty\}$. Suppose that, for some $k$, one of the following conditions is satisfied:

  • there is $\tau \in \mathbb{H}$ such that $f_k(z)/f_n(z) \to \infty$ as $z \to \tau$, or

  • $f_k(z)/f_n(z) \to \infty$ as $\operatorname{Im}(z) \to +\infty$.

Then there is a sequence of points $\{z_m\}_{m \in \mathbb{N}} \subseteq \mathbb{H}$ with $z_m \to \tau$ and $z_m \neq \tau$ in the first case, or $\operatorname{Im}(z_m) \to +\infty$ and $0 \leq \operatorname{Re}(z_m) \leq 2$ in the second case, such that, for all sufficiently large $m$, the point $z_m + m$ is a solution to the equation

$$f_n(z) z^n + f_{n-1}(z) z^{n-1} + \cdots + f_0(z) = 0.$$

Let us consider an example illustrating how we apply Proposition 1.6 in practice. It also gives an idea of our approach in the general case.

Example 1.7

Consider the equation

$$j'(z)^2 + p(j(z)) = 0, \tag{1.1}$$

where either $p(j(z)) = j(z)(j(z) - 1728)$ or $p(j(z)) = j(z)^2 (j(z) - 1728)$. First, we want to get an equivalent equation written as a sum of powers of $z$ with periodic coefficients. To that end, we apply the $\mathrm{SL}_2(\mathbb{Z})$-transformation $z \mapsto -\frac{1}{z}$ and, using the identities $j\left(-\frac{1}{z}\right) = j(z)$ and $j'\left(-\frac{1}{z}\right) = z^2 j'(z)$, we get

$$z^4 j'(z)^2 + p(j(z)) = 0. \tag{1.2}$$
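The passage from (1.1) to (1.2) is a one-line chain-rule computation. The following sketch spells it out, using only the invariance of $j$ under $z \mapsto -\frac{1}{z}$.

```latex
% Differentiate the invariance relation j(-1/z) = j(z); note d/dz(-1/z) = 1/z^2.
\begin{align*}
j\!\left(-\tfrac{1}{z}\right) = j(z)
&\;\Longrightarrow\;
j'\!\left(-\tfrac{1}{z}\right)\cdot \frac{1}{z^{2}} = j'(z)
\;\Longrightarrow\;
j'\!\left(-\tfrac{1}{z}\right) = z^{2}\, j'(z),\\[4pt]
j'\!\left(-\tfrac{1}{z}\right)^{2} + p\!\left(j\!\left(-\tfrac{1}{z}\right)\right)
&= z^{4}\, j'(z)^{2} + p\bigl(j(z)\bigr).
\end{align*}
% Hence w solves (1.2) if and only if -1/w solves (1.1).
```

Since $z \mapsto -\frac{1}{z}$ is a bijection of $\mathbb{H}$, solutions of (1.1) and (1.2) correspond to one another under this map.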

Thus we obtain an equation in a suitable form for using Proposition 1.6, where $f_4 = (j')^2$, $f_3 = f_2 = f_1 = 0$ and $f_0 = p(j)$. When $p(j) = j(j - 1728)$, the ratio

$$\frac{f_0}{f_4} = \frac{j(j - 1728)}{(j')^2}$$

has a pole at $\tau = \rho$. When $p(j) = j^2(j - 1728)$, the ratio

$$\frac{f_0}{f_4} = \frac{j^2(j - 1728)}{(j')^2}$$

has no finite poles, but it has limit $\infty$ as $\operatorname{Im}(z) \to +\infty$.

Thus, by Proposition 1.6, there is a sequence $z_m$ with $z_m \to \tau$ and $z_m \neq \tau$ in the first case, or $\operatorname{Im}(z_m) \to +\infty$ and $0 \leq \operatorname{Re}(z_m) \leq 2$ in the second case, such that, for all sufficiently large $m$, the point $z_m + m$ is a solution to equation (1.2). This already implies that the solutions of (1.2) intersect infinitely many $\mathrm{SL}_2(\mathbb{Z})$-orbits.

To deduce Zariski density of these solutions, suppose that all of them are also solutions of another independent equation $G(z, j(z), j'(z), j''(z)) = 0$. Combining this and (1.2), we can eliminate $z$ and end up with an equation $H(j, j', j'') = 0$. Now, our assumption means that $H(j, j', j'')$ vanishes at $z_m + m$, hence also at $z_m$ by periodicity. This is not possible, for a non-zero 1-periodic holomorphic function, meromorphic at $i\infty$, cannot have infinitely many zeroes with real part bounded from above and below and imaginary part bounded from below. This then implies the Zariski density of solutions of (1.1).

We also note that our criteria can be applied to more general periodic functions, beyond polynomials in $j, j', j''$, such as $\exp$ or the Weierstrass $\wp$-function. For instance, Proposition 1.6 implies that the function $j'(z) z + \exp(2\pi i z)$ has infinitely many zeroes around the points $i + m$, where $m$ is a large integer.

1.1 Structure of the paper

In Section 2, we go over some basic preliminaries about the $j$-function and its derivatives. We also give the definition of Zariski density used in Theorem 1.2. In Section 3, we prove Theorem 1.1 by extending the methods of [9], which are based on Rouché’s theorem. In Section 4, we prove criteria for the existence and distribution of solutions of equations involving periodic functions, which combined imply Proposition 1.6 (but are significantly more general). The approach used here involves Rouché’s theorem, the Argument Principle, and some elementary methods from valuation theory. These methods do not appear in later sections of the paper. In Section 5, we use the results of the previous section to obtain concrete criteria for proving Zariski density of the solutions of equations of the form $F(z, j(z), j'(z), j''(z)) = 0$. These criteria are about the presence of poles in quotients of certain polynomials only in $j, j', j''$. In Section 6, we produce zero estimates for polynomials in $z$, $j$, $j'$, $j''$ in order to determine when the quotients mentioned above have poles. In Section 7, we first prove Zariski density for $\mathbf{j}$-homogeneous equations (Definition 7.1), which only involve $j(z)$, $j'(z)$ and $j''(z)$, including the equation $j''(z) = 0$ (Section 7.1). Finally, in Section 7.2, we prove Theorem 1.2 in full generality.

2 Preliminaries

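Let $\mathbb{H}$ denote the complex upper half-plane $\{ z \in \mathbb{C} : \operatorname{Im}(z) > 0 \}$. The group $\mathrm{GL}_2^+(\mathbb{R})$ of $2 \times 2$ real invertible matrices with positive determinant acts on $\mathbb{H}$ via linear fractional transformations

$$g z := \frac{az + b}{cz + d} \quad \text{for } g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2^+(\mathbb{R}).$$

This action can be seen as a restriction of the action of $\mathrm{GL}_2(\mathbb{C})$ on the Riemann sphere $\mathbb{C} \cup \{\infty\}$. The modular group is defined as

$$\mathrm{SL}_2(\mathbb{Z}) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2^+(\mathbb{R}) : a, b, c, d \in \mathbb{Z} \text{ and } ad - bc = 1 \right\}.$$

As a group, $\mathrm{SL}_2(\mathbb{Z})$ is generated by the two elements $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, which correspond to the actions $z \mapsto z + 1$ and $z \mapsto -\frac{1}{z}$, respectively.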

The modular $j$-function can be defined as the unique $\mathrm{SL}_2(\mathbb{Z})$-automorphic function $j\colon \mathbb{H} \to \mathbb{C}$ satisfying $j(\rho) = 0$ (recall that $\rho := \exp\left(\frac{2\pi i}{3}\right)$; this notation will be kept throughout the paper), $j(i) = 1728$, and $j(\infty) = \infty$ (this last condition should be understood as $\lim_{\operatorname{Im}(z) \to +\infty} j(z) = \infty$). This means that $j$ satisfies $j(\gamma z) = j(z)$ for every $\gamma$ in $\mathrm{SL}_2(\mathbb{Z})$ and every $z$ in $\mathbb{H}$; in particular, it is 1-periodic, and by assumption meromorphic at $i\infty$. Its Fourier expansion (also known as a $q$-expansion) is of the form

$$j(z) = q^{-1} + 744 + \sum_{k=1}^{\infty} a_k q^k, \quad \text{with } q := \exp(2\pi i z) \text{ and } a_k \in \mathbb{C}. \tag{2.1}$$

In fact, $a_k \in \mathbb{Z}$ for every $k \in \mathbb{N}$. The $j$-function induces an analytic isomorphism of Riemann surfaces $\mathrm{SL}_2(\mathbb{Z}) \backslash \mathbb{H} \cong \mathbb{C}$ (see [13, Chapter 3, §3]).

Since $j$ is invariant under the action of $\mathrm{SL}_2(\mathbb{Z})$, we can study the behaviour of $j$ by looking at fundamental domains of the action of $\mathrm{SL}_2(\mathbb{Z})$ on $\mathbb{H}$. The standard fundamental domain is the set

$$\mathbb{F} := \left\{ z \in \mathbb{H} : -\tfrac{1}{2} \leq \operatorname{Re}(z) < \tfrac{1}{2},\ |z| \geq 1,\ \left( |z| = 1 \implies -\tfrac{1}{2} \leq \operatorname{Re}(z) \leq 0 \right) \right\}.$$

We let $\overline{\mathbb{F}}$ denote the Euclidean closure of $\mathbb{F}$ (within the Riemann sphere). A diagram of the standard fundamental domain along with some of its $\mathrm{SL}_2(\mathbb{Z})$-translates is given in Figure 1. When we refer to a fundamental domain, we will always mean a set of the form $\gamma\mathbb{F}$ for some $\gamma \in \mathrm{SL}_2(\mathbb{Z})$.

Figure 1. The fundamental domains of the action by $\mathrm{SL}_2(\mathbb{Z})$, where $\mathbb{F}$ is highlighted by the striped background.

It is well known that 𝑗 satisfies the following third-order differential equation (and none of lower order [15]):[2]

$$0 = \frac{j'''}{j'} - \frac{3}{2} \left( \frac{j''}{j'} \right)^2 + \frac{j^2 - 1968 j + 2654208}{2 j^2 (j - 1728)^2} (j')^2. \tag{2.2}$$

This shows that the derivatives of $j$ of order at least 3 are rational over $j$, $j'$, and $j''$. Mahler’s result [15] implies that $j$, $j'$, and $j''$ are algebraically independent over $\mathbb{C}$.

The functions $j$, $j'$, and $j''$ are all 1-periodic and meromorphic at $i\infty$, and by differentiating (2.1), we can obtain the following $q$-expansions of $j'$ and $j''$:

$$j'(z) = -\frac{2\pi i}{q} + 2\pi i \sum_{k \geq 1} a_k k q^k, \qquad j''(z) = -\frac{4\pi^2}{q} - 4\pi^2 \sum_{k \geq 1} a_k k^2 q^k.$$
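As a quick check, the two expansions follow from (2.1) by term-by-term differentiation. The sketch below uses only $q = \exp(2\pi i z)$, so that $\frac{d}{dz} q^k = 2\pi i\, k\, q^k$ for every integer $k$.

```latex
% Term-by-term differentiation of j(z) = q^{-1} + 744 + \sum_{k \ge 1} a_k q^k,
% using d/dz (q^k) = 2\pi i k q^k and (2\pi i)^2 = -4\pi^2.
\begin{align*}
j'(z)  &= 2\pi i \Bigl( -q^{-1} + \sum_{k \ge 1} k\, a_k\, q^{k} \Bigr)
        = -\frac{2\pi i}{q} + 2\pi i \sum_{k \ge 1} a_k\, k\, q^{k},\\
j''(z) &= (2\pi i)^{2} \Bigl( q^{-1} + \sum_{k \ge 1} k^{2} a_k\, q^{k} \Bigr)
        = -\frac{4\pi^{2}}{q} - 4\pi^{2} \sum_{k \ge 1} a_k\, k^{2}\, q^{k}.
\end{align*}
```

In particular, each of $j$, $j'$, $j''$ has a simple pole at $q = 0$, with leading coefficients $1$, $-2\pi i$ and $-4\pi^2$ respectively.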

Observe that $\mathbb{Q} \cup \{\infty\}$ forms a single orbit under the action of $\mathrm{SL}_2(\mathbb{Z})$. We call these elements the cusps of $j$. Given a fundamental domain $\gamma\mathbb{F}$, the cusp of $\gamma\mathbb{F}$ is the unique element of $\mathbb{Q} \cup \{\infty\}$ contained in the Euclidean closure of $\gamma\mathbb{F}$ (where the closure is taken in the Riemann sphere).

Using (2.1) and the $q$-expansions of $j'$ and $j''$, it is easy to see that, for every $x \in \mathbb{R}$, each of the expressions $j(x + iy)$, $j'(x + iy)$, and $j''(x + iy)$ grows exponentially to $\infty$ as $y \to +\infty$. In this case, we sometimes write $z \to i\infty$ to emphasise that $z$ approaches $\infty$ by increasing its imaginary part, while the real part remains bounded. A similar behaviour takes place when $z$ approaches a rational number from within a fixed fundamental domain containing that rational number in its Euclidean closure. We will give a precise description of this behaviour in Section 4.

We finish this section with the definition of what we mean by finding a Zariski dense set of solutions to an equation.

Definition 2.1

Let $F(X, Y_0, Y_1, Y_2)$ be a polynomial over $\mathbb{C}$. We say the equation

$$F(z, j(z), j'(z), j''(z)) = 0$$

has a Zariski dense set of solutions if, for any polynomial $G(X, Y_0, Y_1, Y_2)$ which is not divisible by some irreducible factor of $F$, there is $z_0 \in \mathbb{H}$ such that $F(z_0, j(z_0), j'(z_0), j''(z_0)) = 0$ and $G(z_0, j(z_0), j'(z_0), j''(z_0)) \neq 0$.

Clearly, it suffices to prove Zariski density for irreducible polynomials to obtain Theorem 1.2, so from now on, we will reduce to the case where 𝐹 is irreducible.

3 Existence of solutions

We start with the Rouché method for proving the existence of solutions, but not yet their Zariski density. We first recall the crucial theorem.

Theorem 3.1 (Rouché; see e.g. [14, Chapter VI, §1, Theorem 1.6])

Let f , g be meromorphic functions on a complex domain Ω. Let 𝐶 denote a simple closed curve which is homologous to 0 in Ω and such that 𝑓 has no zeroes or poles on 𝐶. If the inequality

| g ( z ) | < | f ( z ) |

holds for all 𝑧 on 𝐶, then the difference between the numbers of zeroes and poles in the interior of 𝐶 for the functions f + g and 𝑓 is the same.

It is well known that the $\mathrm{SL}_2(\mathbb{Z})$-orbit of any point in $\mathbb{H}$ accumulates (within the Riemann sphere) only at the boundary of $\mathbb{H}$, that is, at $\mathbb{R} \cup \{\infty\}$. The following lemma will help us choose convenient sequences within any given orbit converging to points in $\mathbb{R}$.

Lemma 3.2

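Let $z \in \mathbb{H}$ and $u \in \mathbb{R}$.

  1. If $u \in \mathbb{R} \setminus \mathbb{Q}$ and the sequence $\gamma_k = \begin{pmatrix} a_k & b_k \\ c_k & d_k \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$ is such that $\gamma_k z \to u$ as $k \to +\infty$, then $|c_k| \to +\infty$ and $\frac{a_k}{c_k} \to u$ as $k \to +\infty$.

  2. If $u = \frac{a}{c} \in \mathbb{Q}$ with $\gcd(a, c) = 1$, then there is a sequence $\gamma_k = \begin{pmatrix} a & b_k \\ c & d_k \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$ such that $|b_k|, |d_k| \to +\infty$ and $\gamma_k z \to u$ as $k \to +\infty$.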

Proof

(i) We first show that $|c_k| \to +\infty$. Since every subsequence of $\gamma_k z$ tends to $u$, it suffices to show that $c_k$ is unbounded. Assume it is bounded; then we can choose a subsequence where the value of $c_k$ is constant. So assume $c_k = c$ is a constant sequence.

Recall that $\gamma_k z \to u$, and let $z = x + iy$. If $d_k$ is also bounded, then we may assume it is constant. Then $a_k z + b_k = (a_k x + b_k) + a_k y i$ must be convergent; hence $a_k$ must be convergent, and so constant. Then $b_k$ is also constant, for $a_k d_k - b_k c = 1$, a contradiction.

Thus we may assume $|d_k| \to +\infty$. Then

$$\frac{\frac{a_k}{d_k} z + \frac{b_k}{d_k}}{\frac{c}{d_k} z + 1} \to u.$$

Therefore,

$$\frac{a_k}{d_k} z + \frac{b_k}{d_k} = \left( \frac{a_k}{d_k} x + \frac{b_k}{d_k} \right) + \frac{a_k}{d_k} y i \to u.$$

This implies

$$\frac{a_k}{d_k} \to 0, \qquad \frac{b_k}{d_k} \to u.$$

On the other hand, $a_k d_k - b_k c = 1$; hence $a_k - \frac{b_k}{d_k} c = \frac{1}{d_k} \to 0$. Thus $a_k \to uc$, which means $a_k = a \in \mathbb{Z}$ is constant. But then $u = \frac{a}{c} \in \mathbb{Q}$, contradicting the assumption that $u$ is irrational.

Since $|c_k| \to +\infty$, then using that

$$\frac{a_k z + b_k}{c_k z + d_k} \cdot \frac{c_k}{a_k} = \frac{a_k z + b_k}{a_k z + b_k + \frac{1}{c_k}} \to 1,$$

we see that $\gamma_k z$ and $\frac{a_k}{c_k}$ must have the same limit.

(ii) As $u = \frac{a}{c}$ with $\gcd(a, c) = 1$, there are integers $m, l$ such that $am - cl = 1$. Choose $b_k = l + ka$, $d_k = m + kc$. Then

$$\lim_{k \to +\infty} \frac{az + b_k}{cz + d_k} = \lim_{k \to +\infty} \frac{\frac{a}{d_k} z + \frac{b_k}{d_k}}{\frac{c}{d_k} z + 1} = \lim_{k \to +\infty} \frac{b_k}{d_k} = \lim_{k \to +\infty} \frac{l + ak}{m + ck} = \frac{a}{c} = u.$$
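The choice of $b_k$ and $d_k$ in part (ii) can be verified directly. The sketch below checks the determinant condition in general and then works through one concrete instance; the values $a = 1$, $c = 2$, $m = 1$, $l = 0$ are illustrative choices, not taken from the text.

```latex
% General check that \gamma_k has determinant 1, using am - cl = 1:
\begin{align*}
a\, d_k - b_k\, c &= a(m + kc) - (l + ka)c = am - cl = 1.
\end{align*}
% Illustrative instance (hypothetical values): u = 1/2, a = 1, c = 2, m = 1, l = 0,
% so that am - cl = 1, b_k = k and d_k = 1 + 2k.
\begin{align*}
\gamma_k = \begin{pmatrix} 1 & k \\ 2 & 1 + 2k \end{pmatrix},
\qquad
\det \gamma_k = (1 + 2k) - 2k = 1,
\qquad
\gamma_k z = \frac{z + k}{2z + 2k + 1} \xrightarrow[k \to +\infty]{} \frac{1}{2}.
\end{align*}
```

Note that the first column of $\gamma_k$ is fixed, so every $\gamma_k$ sends the cusp $\infty$ to $u = \frac{a}{c}$.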

In order to ease notation, we will start using bold-faced letters to denote vectors, so we set $\mathbf{Y} := (Y_0, Y_1, Y_2)$ and $\mathbf{j} := (j, j', j'') \colon \mathbb{H} \to \mathbb{C}^3$.

We are now ready to prove Theorem 1.1 which we restate below for convenience.

Theorem 1.1

For every $F(X, \mathbf{Y}) \in \mathbb{C}[X, \mathbf{Y}] \setminus \mathbb{C}[X]$, the equation $F(z, \mathbf{j}(z)) = 0$ has infinitely many solutions.

Proof

If $F$ does not depend on $Y_1$ and $Y_2$, then we are done by the results of [9]. So assume $F$ depends on $Y_1$ or $Y_2$. The argument below is a generalisation of the method of [9].

Let $r(X) := -F(X, 0, 0, 0)$ and $G(X, \mathbf{Y}) := F(X, \mathbf{Y}) + r(X)$. Further, let

$$f(z) := G(z, \mathbf{j}(z)).$$

Then we want to solve the equation

$$f(z) = r(z).$$

Notice that $f(\rho) = 0$, for $j(\rho) = j'(\rho) = j''(\rho) = 0$. Let $B \subseteq \mathbb{C}$ be a closed disc centred at $\rho$ with sufficiently small radius such that $j'(z) \neq 0$ and $j''(z) \neq 0$ for $z \in B \setminus \{\rho\}$.

Pick a point $u = \frac{a}{c} \in \mathbb{Q}$, and choose a sequence $\gamma_k \in \mathrm{SL}_2(\mathbb{Z})$ as in Lemma 3.2(ii) such that $\gamma_k z \to u$ as $k \to +\infty$ for any $z \in \mathbb{H}$. Let $B_k := \gamma_k B$. By compactness of $B$, the function $r(\gamma_k z)$ tends to $r(u)$ uniformly for $z \in B$.

We have

$$\begin{aligned} f(\gamma_k z) &= G\bigl(\gamma_k z,\ j(z),\ (cz + d_k)^2 j'(z),\ (cz + d_k)^4 j''(z) + 2c (cz + d_k)^3 j'(z)\bigr) \\ &= G\Bigl(\gamma_k z,\ j(z),\ d_k^2 \bigl(\tfrac{c}{d_k} z + 1\bigr)^2 j'(z),\ d_k^4 \Bigl(\bigl(\tfrac{c}{d_k} z + 1\bigr)^4 j''(z) + \tfrac{2c}{d_k} \bigl(\tfrac{c}{d_k} z + 1\bigr)^3 j'(z)\Bigr)\Bigr). \end{aligned}$$
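The transformation rules for $j'$ and $j''$ used in this formula come from differentiating the invariance relation $j(\gamma z) = j(z)$. A short derivation, for a general $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$, reads as follows.

```latex
% Derivative of the Moebius map, using ad - bc = 1:
\begin{align*}
\frac{d}{dz}(\gamma z) &= \frac{a(cz + d) - c(az + b)}{(cz + d)^{2}} = \frac{1}{(cz + d)^{2}}.
\end{align*}
% First derivative: differentiate j(\gamma z) = j(z).
\begin{align*}
\frac{j'(\gamma z)}{(cz + d)^{2}} = j'(z)
&\;\Longrightarrow\;
j'(\gamma z) = (cz + d)^{2}\, j'(z).
\end{align*}
% Second derivative: differentiate j'(\gamma z) = (cz + d)^2 j'(z).
\begin{align*}
\frac{j''(\gamma z)}{(cz + d)^{2}} = 2c\,(cz + d)\, j'(z) + (cz + d)^{2}\, j''(z)
&\;\Longrightarrow\;
j''(\gamma z) = (cz + d)^{4}\, j''(z) + 2c\,(cz + d)^{3}\, j'(z).
\end{align*}
```

Taking $\gamma = \gamma_k$, so that $d = d_k$, and factoring out powers of $d_k$ gives the second line of the displayed formula.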

Consider the polynomial $G(X, Y_0, T^2 Y_1, T^4 Y_2)$ as a polynomial in $T$. It clearly has positive degree, for otherwise $G$ (and hence $F$) would depend on neither $Y_1$ nor $Y_2$. Let its leading term be $H(X, \mathbf{Y}) T^m$. Since $\frac{c}{d_k} \to 0$, we see that

$$f(\gamma_k z) = d_k^m H(u, \mathbf{j}(z)) + o(d_k^m) \quad \text{as } k \to +\infty.$$

We can now shrink $B$ to make sure that $H(u, \mathbf{j}(z)) \neq 0$ on $\partial B$, so it is uniformly bounded away from 0 for $z \in \partial B$. In particular, $f(\gamma_k z)$ approaches infinity as $k \to +\infty$ uniformly for $z \in \partial B$. So, for sufficiently large $k$, the inequality $|f(z)| > |r(z)|$ holds for all

$$z \in \partial B_k = \partial(\gamma_k B) = \gamma_k \partial B,$$

and we can apply Rouché’s theorem to these functions. Since $f$ has a zero in $B_k$, namely $\gamma_k \rho$, so does $f - r$. ∎

Remark 3.3

The following more general statement can be proven by the same argument. Let $F(X, \mathbf{Y}) \in \mathbb{C}[X, \mathbf{Y}] \setminus \mathbb{C}[X]$. Let $U \subseteq \mathbb{C}$ be an open set such that $U \cap \mathbb{R} \neq \emptyset$ and let $f\colon U \to \mathbb{C}$ be a holomorphic function. Then the equation $F(z, \mathbf{j}(z)) = f(z)$ has infinitely many solutions.

As mentioned in Section 1, the proof of Theorem 1.1 does not guarantee a Zariski dense set of solutions. For instance, if $F(X, 0, 0, 0) \equiv 0$, that is, $F$ has no term depending only on $X$, then the only solutions yielded by the method above are the $\mathrm{SL}_2(\mathbb{Z})$-conjugates of $\rho$. In order to establish Zariski density, we will look at a refinement of the procedure, where we transform the equation by convenient elements of $\mathrm{SL}_2(\mathbb{Z})$. This will be done starting in Section 6, but first, in the next section, we will develop some tools to study equations involving periodic functions.

4 Solvability of certain equations involving periodic functions

In this section, we establish some general criteria for the solvability of equations involving periodic functions and, in particular, prove Proposition 1.6. We remark that this section is independent in many ways from the rest of the paper as the results we prove make no reference to 𝑗 or its derivatives, and in particular, the methods developed here will not reappear in the following sections.

We recall that, given a meromorphic function $f$, not identically 0, and a point $z_0$, the order of $f$ at $z_0$ is the unique integer $n$ such that $(z - z_0)^{-n} f(z)$ is holomorphic and non-zero at $z_0$.

Proposition 4.1

Let $f_0, \ldots, f_n \colon \mathbb{H} \to \mathbb{C}$ be 1-periodic meromorphic functions and let $-\ell$ be the minimum order of $f_k/f_n$ at a fixed $z_0 \in \mathbb{H}$ for $k = 0, \ldots, n-1$.

If $\ell > 0$, then for any sufficiently small disc $D$ centred at $z_0$ and for every sufficiently large $m \in \mathbb{Z}$, the equation

$$f_n(z) z^n + f_{n-1}(z) z^{n-1} + \cdots + f_0(z) = 0$$

has $\ell$ solutions, counted with multiplicity, in $m + (D \setminus \{z_0\})$.

Proof

For simplicity, assume that $f_0/f_n$ has a pole at $z_0 \in \mathbb{H}$ of order $\ell > 0$; the same proof will work in the general case with trivial modifications.

Under the above assumptions, $(f_n/f_0)(z_0) = 0$, and moreover, $(f_k/f_0)(z_0) \neq \infty$ for all $k$. Let $F(z) := f_n(z) z^n + f_{n-1}(z) z^{n-1} + \cdots + f_0(z)$. Consider the functions

$$G(z) := \frac{f_n(z)}{f_0(z)} z^n \quad \text{and} \quad H(z) := \frac{f_{n-1}(z)}{f_0(z)} z^{n-1} + \cdots + \frac{f_1(z)}{f_0(z)} z + 1.$$

Pick a small closed disc $D$ centred at $z_0$ such that the $f_k$’s have no zeroes nor poles in $D \setminus \{z_0\}$. Since the functions $f_k(z)/f_0(z)$ are periodic and bounded on $D$, for large enough $m$, we have

$$|G(z + m)| = \left| \frac{f_n(z)}{f_0(z)} \right| |z + m|^n > |H(z + m)| \quad \text{for } z \in \partial D.$$

By Rouché’s Theorem 3.1, the number of zeroes of the functions

$$G(z + m) \quad \text{and} \quad G(z + m) + H(z + m) = f_0(z)^{-1} F(z + m)$$

inside $D$ is the same. Since the former has a zero at $z_0$ of order $\ell$ and no other zero, the latter must also have $\ell$ zeroes in $D$, counted with multiplicity. Thus $f_0(z)^{-1} F(z + m)$ has $\ell$ zeroes in $D$, and so $f_0^{-1} F$ has $\ell$ zeroes in $m + D$, counted with multiplicity.

Finally, note that $G(z_0 + m) + H(z_0 + m) = 0$ holds for at most $n - 1$ values of $m$; thus, for $m$ sufficiently large, the above $\ell$ solutions in $m + D$ are actually in $m + (D \setminus \{z_0\})$.

This finishes the proof of the proposition when $f_0/f_n$ has a pole at $z_0$ of order greater than or equal to that of $f_k/f_n$ for $k = 1, \ldots, n-1$. When the maximum pole order at $z_0$ is attained by $f_k/f_n$ for some $k \neq 0$, simply divide by $f_k(z)$ rather than $f_0(z)$ when defining $G$ and $H$. ∎
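The Rouché inequality in the proof can be made quantitative. The sketch below treats large positive $m$ (negative $m$ is analogous) in the simplified case where $f_0/f_n$ has the pole of maximal order; the constants $\mu$, $M$ and $R$ are notation introduced here for illustration and do not appear in the text.

```latex
% Notation introduced only for this sketch (not from the paper):
%   \mu := \min_{z \in \partial D} |f_n(z)/f_0(z)| > 0   (f_n/f_0 vanishes in D only at z_0),
%   M   := \max_{1 \le k \le n-1} \max_{z \in \partial D} |f_k(z)/f_0(z)| < \infty,
%   R   := \max_{z \in \partial D} |z|.
% For z \in \partial D and integers m > R + 1 we have 1 < m - R \le |z + m| \le m + R, hence
\begin{align*}
|G(z + m)| &= \left| \frac{f_n(z)}{f_0(z)} \right| |z + m|^{n} \;\ge\; \mu\,(m - R)^{n},\\
|H(z + m)| &\le 1 + \sum_{k=1}^{n-1} \left| \frac{f_k(z)}{f_0(z)} \right| |z + m|^{k}
            \;\le\; 1 + (n - 1)\, M\, (m + R)^{n-1}.
\end{align*}
% The lower bound grows like m^n and the upper bound like m^{n-1},
% so |G(z + m)| > |H(z + m)| on \partial D once m is large enough.
```

Both bounds use the periodicity $f_k(z + m) = f_k(z)$, which is what allows the coefficients to be estimated on the fixed circle $\partial D$ while only the powers of $z + m$ grow.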

Following the notation of the proposition, when the $f_k$’s are polynomials in $j$, $j'$, $j''$ and some $f_k/f_n$ has a pole in $\mathbb{H}$, the above proposition applies. Instead, when the functions $f_k/f_n$ have no poles in $\mathbb{H}$, we will rely on the asymptotic behaviour of $f_k/f_n$ towards the boundary of $\mathbb{H}$. Specifically, we will prove an analogue of Proposition 4.1 in the case where

$$\frac{f_k(z)}{f_n(z)} \to \infty \quad \text{as } z \to i\infty.$$

As the following example shows, we may also need to consider non-periodic functions which are asymptotically periodic. Dealing with those functions requires a considerably more sophisticated setup than the one in Proposition 1.6, so we first discuss the example in detail to clarify the choices made in the rest of this section.

Example 4.2

Consider the equation

$$(j')^5 + j^2 (j - 1728)^2 (j'')^2 + \alpha j^2 (j - 1728) (j')^3 = 0,$$

where $\alpha$ is to be determined later. In order to write this equation as a polynomial in $z$ with periodic coefficients, we apply the $z \mapsto -\frac{1}{z}$ transformation[3] and get

$$\begin{aligned} & z^{10} (j'(z))^5 + z^8 j(z)^2 (j(z) - 1728)^2 (j''(z))^2 + z^7 \cdot 4 j(z)^2 (j(z) - 1728)^2 j'(z) j''(z) \\ &\quad + z^6 \bigl( 4 j(z)^2 (j(z) - 1728)^2 (j'(z))^2 + \alpha j(z)^2 (j(z) - 1728) (j'(z))^3 \bigr) = 0. \end{aligned}$$
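The transformed equation can be recovered by substituting the rules $j\left(-\frac{1}{z}\right) = j(z)$, $j'\left(-\frac{1}{z}\right) = z^2 j'(z)$ and $j''\left(-\frac{1}{z}\right) = z^4 j''(z) + 2 z^3 j'(z)$ (the case $c = 1$, $d = 0$ of the formulas in the proof of Theorem 1.1) into each term. A sketch of the expansion, writing $j, j', j''$ for the values at $z$:

```latex
% Each term of the original equation evaluated at -1/z:
\begin{align*}
\bigl(j'(-\tfrac{1}{z})\bigr)^{5} &= z^{10}\,(j')^{5},\\
\bigl(j''(-\tfrac{1}{z})\bigr)^{2} &= \bigl(z^{4} j'' + 2 z^{3} j'\bigr)^{2}
   = z^{8}\,(j'')^{2} + 4 z^{7}\, j' j'' + 4 z^{6}\,(j')^{2},\\
\bigl(j'(-\tfrac{1}{z})\bigr)^{3} &= z^{6}\,(j')^{3}.
\end{align*}
% Collecting powers of z, with P := j^2 (j - 1728)^2 and Q := j^2 (j - 1728)
% (P and Q are shorthand introduced only for this sketch):
\begin{align*}
z^{10}\,(j')^{5} + z^{8}\, P\,(j'')^{2} + z^{7}\cdot 4 P\, j' j''
 + z^{6}\bigl(4 P\,(j')^{2} + \alpha\, Q\,(j')^{3}\bigr) = 0.
\end{align*}
```

The only term contributing to more than one power of $z$ is $(j'')^2$, because $j''$ is the only function among $j, j', j''$ whose transformation rule is not a pure power of $z$.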

In this example, the ratio of the coefficients of $z^8$ and $z^{10}$ actually has a pole at $i$. However, checking for poles among such ratios in a general equation requires sufficiently precise zero estimates at the conjugates of $\rho$ and $i$, which are hard to produce for polynomials involving $j''$ (see Example 6.7); on the other hand, we can provide sharp zero estimates for polynomials in $j$, $j'$ only (see Section 6). The latter estimates turn out to be enough: for instance, when the original equation does not contain $z$, after the $z \mapsto -\frac{1}{z}$ transformation, the coefficient of the lowest power of $z$ does not depend on $j''$. This is exemplified here by the coefficient of $z^6$. Hence our strategy hinges on the fact that the ratio between a particular coefficient not involving $j''$, which we can procure in all cases, and the leading coefficient has a pole or exponential growth in some fundamental domain.

In this particular example, the ratio in question is between the coefficients of $z^6$ and $z^{10}$, thus the function

$$f(z) := \frac{4 j^2 (j - 1728)^2 + \alpha j^2 (j - 1728) j'}{(j')^3}.$$

We claim that $f(z)$ has no pole in $\mathbb{H}$. Indeed, easy calculations show that the orders of the numerator at $\rho$ and $i$ (and their $\mathrm{SL}_2(\mathbb{Z})$-orbits) are equal to 6 and 3 respectively. The denominator has the same orders at these points, so $f$ has no poles. Moreover, choosing $\alpha = \frac{2}{\pi i}$ ensures the leading terms in the $q$-expansions of the two terms in the numerator cancel out. This then means that $f(z)$ tends to a constant as $z \to i\infty$. Therefore, Proposition 1.6 cannot be applied in this situation. However, $f(z)$ has exponential growth as we approach 0 from within a fundamental domain with a cusp at 0 (in fact, we shall prove that $f(z)$ must have exponential growth in most fundamental domains). Indeed, after applying the $z \mapsto -\frac{1}{z}$ transformation, we get

$$g(z) := f\left(-\frac{1}{z}\right) = \frac{4 j(z)^2 (j(z) - 1728)^2 + \alpha z^2 j(z)^2 (j(z) - 1728) j'(z)}{z^6 j'(z)^3},$$

and because of the extra factor $z^2$ in the second summand in the numerator, no cancellation is possible, thus guaranteeing that $g(z)$ grows exponentially as $z \to i\infty$. Note however that this function is not periodic, but only asymptotically periodic in the sense that

$$\lim_{z \to i\infty} \frac{g(z+1)}{g(z)} = 1.$$

This fact is responsible for the technicalities in the rest of this section.
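The behaviour of $f$ and $g$ in this example can be sanity-checked numerically. The following is a minimal sketch, not part of the paper's argument: it assumes the standard $q$-series for $E_4$, $E_6$, $\Delta$ and the classical identity $j' = -2\pi i\, E_4^2 E_6 / \Delta$, and the names `j_and_derivative`, `f`, `g` are hypothetical helpers introduced here. With $\alpha = \frac{2}{\pi i}$, a short $q$-expansion computation suggests that $f(z)$ tends to $492 i / \pi^3$ as $z \to i\infty$, while $|g(z)|$ keeps growing.

```python
import cmath
import math

# Value of alpha from the example: it cancels the leading q^{-4} terms.
ALPHA = 2 / (math.pi * 1j)


def j_and_derivative(z, terms=300):
    """Return (j(z), j'(z)) from truncated q-series (assumed standard formulas)."""
    q = cmath.exp(2j * math.pi * z)
    e4, e6, delta = 1 + 0j, 1 + 0j, q
    qn = 1 + 0j
    for n in range(1, terms + 1):
        qn *= q
        lam = qn / (1 - qn)          # Lambert series term q^n / (1 - q^n)
        e4 += 240 * n**3 * lam
        e6 -= 504 * n**5 * lam
        delta *= (1 - qn) ** 24      # Delta = q * prod (1 - q^n)^24
    jv = e4**3 / delta
    dj = -2j * math.pi * e4**2 * e6 / delta
    return jv, dj


def f(z):
    """The ratio of the z^6 and z^10 coefficients from the example."""
    jv, dj = j_and_derivative(z)
    num = 4 * jv**2 * (jv - 1728) ** 2 + ALPHA * jv**2 * (jv - 1728) * dj
    return num / dj**3


def g(z):
    """f after the transformation z -> -1/z."""
    return f(-1 / z)


if __name__ == "__main__":
    print("f(0.3 + 3i) =", f(0.3 + 3j))
    print("|g(2i)|, |g(3i)| =", abs(g(2j)), abs(g(3j)))
```

Evaluating `f` high in the standard fundamental domain gives a value close to the predicted constant, whereas `abs(g(z))` increases rapidly with the imaginary part, in line with the claim that the factor $z^2$ prevents cancellation.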

It is worth mentioning that, for equations of the form $F(j, j') = 0$, where $F$ is a polynomial, the transformation $z \mapsto -\frac{1}{z}$ always guarantees that the ratio of the lowest and highest powers of $z$ has a pole in $\mathbb{H}$ or exponential growth at $i\infty$. In general, Proposition 1.6 is sufficient when we deal with equations of the form $F(z, j(z), j'(z)) = 0$, although the argument is somewhat more complicated.

The reader may benefit from revisiting this example after reading the rest of the paper, as it will make the above-mentioned phenomena less obscure.

Notation

Let $\mathcal{P}$ denote the field of 1-periodic meromorphic functions on $\mathbb{H}$ which are also meromorphic at $i\infty$ (recall Definition 1.5). We write $\mathcal{P}[w]$ and $\mathcal{P}(w)$ respectively for the polynomial ring and its fraction field generated by the variable $w$ over $\mathcal{P}$. We remark that the functions in $\mathcal{P}$ will also be thought of as meromorphic functions of the variable $w$.

Also, given an unbounded region $U \subseteq \mathbb{C}$ and two meromorphic functions $f, g$ on $U$, we write $f \sim g$ for $w \to \infty$ in $U$ to mean that the ratio $f(w)/g(w)$ tends to 1 as $w$ approaches infinity from within $U$.

Lemma 4.3

Let $f \in \mathcal{P}(w)$. Then there are $\alpha \in \mathbb{C}^\times$, $e, d \in \mathbb{Z}$, and a positive $C \in \mathbb{R}$ such that $f(w) \sim \alpha w^d q^e$ for $w \to \infty$ in the region $\mathrm{Im}(w) \ge C \log|w|$, where $q = \exp(2\pi i w)$.

Proof

It suffices to prove the conclusion for $f \in \mathcal{P}[w]$. Write

$$f(w) = \sum_{k=0}^{n} g_k(w) w^k, \quad \text{with } g_k \in \mathcal{P}.$$

Each $g_k(w)$ has a meromorphic $q$-expansion $\tilde{g}_k(q)$ which converges on some neighbourhood of $q = 0$. Let $e$ be the minimum order of $\tilde{g}_k(q)$ at $q = 0$ for $k = 0, \dots, n$. Let $\alpha_k \in \mathbb{C}$ be such that $\tilde{g}_k(q) = q^e(\alpha_k + O(q))$. Let $d$ be the maximum $k$ such that $\tilde{g}_k(q)$ has order $e$ at $q = 0$, namely such that $\alpha_k \ne 0$. In the region $|q| \le |w|^{-(n+1)}$, we have $O(q) = O(w^{-(n+1)})$, in which case

$$G(w, q) = \sum_{k=0}^{n} \tilde{g}_k(q) w^k = q^e w^d \sum_{k=0}^{n} \left(\alpha_k + O(w^{-(n+1)})\right) w^{k-d} = \alpha_d q^e w^d \left(1 + O(w^{-1})\right).$$

It now suffices to specialise at $q = \exp(2\pi i w)$ and observe that $|q| = e^{-2\pi \mathrm{Im}\, w} \le |w|^{-(n+1)}$ if and only if $\mathrm{Im}(w) \ge \frac{n+1}{2\pi} \log|w|$. ∎

It follows at once that all functions in $\mathcal{P}(w)$ are “asymptotically periodic” in the sense that $f(w+1) \sim f(w)$ for $w \to \infty$ in the above region. Moreover, the lemma allows us to make the following definition.
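The recipe in the proof of Lemma 4.3 (take the minimal $q$-order $e$, then the largest index $d$ attaining it) can be illustrated on a toy example. The sketch below is only an illustration under stated assumptions: the sample function $f(w) = (q^{-1} + 1) + 2 q^{-1} w + (3 + q) w^2$ and the helper names `leading_asymptotics` and `sample_f` are hypothetical and do not come from the paper.

```python
import cmath
import math


def leading_asymptotics(coeffs):
    """Given coeffs[k] = (q-order of g_k, leading coefficient of g_k),
    return (alpha, d, e) as selected in the proof of Lemma 4.3."""
    e = min(order for order, _ in coeffs)
    d = max(k for k, (order, _) in enumerate(coeffs) if order == e)
    return coeffs[d][1], d, e


def sample_f(w):
    """Toy element of P[w]: g_0 = 1/q + 1, g_1 = 2/q, g_2 = 3 + q."""
    q = cmath.exp(2j * math.pi * w)
    g = [1 / q + 1, 2 / q, 3 + q]
    return sum(gk * w**k for k, gk in enumerate(g))


if __name__ == "__main__":
    alpha, d, e = leading_asymptotics([(-1, 1), (-1, 2), (0, 3)])
    w = 0.3 + 50j  # well inside Im(w) >= (3 / (2 pi)) log|w|
    q = cmath.exp(2j * math.pi * w)
    print(alpha, d, e, sample_f(w) / (alpha * w**d * q**e))
```

For this toy input the predicted asymptotics are $f(w) \sim 2 w q^{-1}$, and the printed ratio is close to 1, with an error of size roughly $1/(2|w|)$ coming from the $g_0$ term.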

Definition 4.4

Call the order at $i\infty$ of $f \in \mathcal{P}(w)$, written $\mathrm{ord}_{w = i\infty}(f)$, the pair $(e, d) \in \mathbb{Z}^2$ of integers such that, for some $\alpha \in \mathbb{C}$ and some $C \in \mathbb{R}$, we have

$$f(w) \sim \alpha w^{-d} \exp(2\pi i e w) \quad \text{for } w \to \infty$$

in the region $\mathrm{Im}(w) \ge C \log|w|$.

We say that $f$ has exponential growth at $i\infty$ if its order is $(e, d)$ with $e < 0$.

Here we consider $\mathbb{Z}^2$ as a lexicographically ordered group. This makes $(\mathcal{P}(w), \mathrm{ord}_{w = i\infty})$ into a valued field, and we have for instance $f(w) \to \infty$ in a suitable region $\mathrm{Im}(w) \ge C \log|w|$ if and only if $\mathrm{ord}_{w = i\infty} f(w) < (0, 0)$.
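As a small illustration of the lexicographic order, Python tuples already compare lexicographically, so orders can be modelled directly as pairs. This is a sketch under one explicit assumption: the sign convention $f(w) \sim \alpha w^{-d} \exp(2\pi i e w)$ for a function of order $(e, d)$, which is the convention under which the "if and only if" statement above holds. The function names are hypothetical.

```python
def ord_product(a, b):
    """Order of a product: orders add componentwise."""
    return (a[0] + b[0], a[1] + b[1])


def ord_sum_lower_bound(a, b):
    """Valuation inequality: the order of a sum is at least the minimum."""
    return min(a, b)  # tuples compare lexicographically


def has_exponential_growth(order):
    """Exponential growth at i*infinity means e < 0."""
    return order[0] < 0


def tends_to_infinity(order):
    """f -> infinity in a suitable region iff the order is below (0, 0)."""
    return order < (0, 0)


if __name__ == "__main__":
    ord_w = (0, -1)   # the function w itself (assumed convention)
    ord_j = (-1, 0)   # a function with a simple pole in q, such as j
    print(tends_to_infinity(ord_w), has_exponential_growth(ord_w))
    print(ord_product(ord_w, ord_j), ord_sum_lower_bound(ord_w, ord_j))
```

Under this convention a function such as $w$ tends to infinity without having exponential growth, while any function with $e < 0$ dominates every power of $w$ in a suitable region.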

We can now set up a generalisation of Proposition 4.1 that will cover our application to 𝑗. Let us fix the following data:

  • a polynomial $F(z, w) = \sum_{k=0}^{n} z^k f_k(w)$, where each $f_k$ is in $\mathcal{P}(w)$ and $f_n \ne 0$;

  • a value 𝑠 which is either 0 or 1.

We look for the zeroes of functions of the form $F_r(w) := F(r + sw, w)$ for $r$ varying among the real numbers. For each $r$ sufficiently large, we pick a suitable rectangle $\Xi_r$, pictured in Figure 2 and defined in Proposition 4.8, and integrate the logarithmic derivative $F_r'/F_r$ along the boundary of $\Xi_r$. Provided that $F_r$ has neither zeroes nor poles on such a boundary, by the Argument Principle (see Theorem 4.7), the value of the integral counts the difference between the number of zeroes and poles, with multiplicity, inside $\Xi_r$. We will choose $\Xi_r$ so that the integral has positive value.

We first parametrise the roots of $F(z, w)$ as a polynomial in $z$ in terms of $w$ varying in a suitable region. The resulting functions, which by construction are algebraic over $\mathcal{P}(w)$, admit an order at $i\infty$ which may be a pair of rational numbers, rather than only integers.

Lemma 4.5

There is a positive $C \in \mathbb{R}$ such that, in the region

$$U = \{ w \in \mathbb{H} : \mathrm{Im}(w) \ge C \log|w| \},$$

there are holomorphic functions $\beta_1, \dots, \beta_n : U \to \mathbb{C}$ such that $F(\beta_k(w), w) = 0$ for all $w \in U$, and if $F(\beta, w) = 0$, then $\beta = \beta_k(w)$ for some $k$.

Moreover, there are $\alpha_k \in \mathbb{C}^\times$, $e_k, d_k \in \mathbb{Q}$ such that $\beta_k(w) \sim \alpha_k w^{-d_k} q^{e_k}$ for $w \to \infty$ in $U$, where $q = \exp(2\pi i w)$ and $w^{-d_k} = \exp(-d_k \log(w))$ for some holomorphic branch of $\log(w)$ on $U$.

Proof

It suffices to prove the conclusion for $F$ irreducible as a polynomial over $\mathcal{P}(w)$.

Let $F^{(z)} = \partial F / \partial z$. Then there are $G, H \in \mathcal{P}(w)[z]$ such that $GF + HF^{(z)} = 1$. We take $C$ large enough that, by Lemma 4.3, the coefficients of $F$, $F^{(z)}$, $G$, and $H$ are holomorphic in the region $U = \{ w \in \mathbb{H} : \mathrm{Im}(w) \ge C \log|w| \}$. In particular, $F(z, w_0)$ and $F^{(z)}(z, w_0)$ have no common roots for any $w_0 \in U$, and so, by the implicit function theorem, and because $U$ is simply connected, there are $m$ holomorphic functions $\beta_1, \dots, \beta_m : U \to \mathbb{C}$ such that $F(\beta_t(w), w) = 0$ for every $t$ and $w \in U$, and moreover taking distinct values at all $w \in U$; thus if $F(\beta, w) = 0$, then $\beta = \beta_k(w)$ for some $k$.

Now fix some $k$. For every $w \in U$, there is some $t$ such that

$$|\beta_k(w)^t f_t(w)| \ge |\beta_k(w)^\ell f_\ell(w)| \quad \text{for every } \ell,$$

and since $F(\beta_k(w), w) = 0$, there is also $h \ne t$ such that $|\beta_k(w)^h f_h(w)| \ge \frac{1}{n} |\beta_k(w)^t f_t(w)|$. Let $U_{t,h}$ be the region where those inequalities hold. We have in particular

$$1 \ge |\beta_k(w)| \sqrt[h-t]{\left| \frac{f_h(w)}{f_t(w)} \right|} \ge \sqrt[h-t]{\frac{1}{n}}.$$

Let

$$(e_k, d_k) = \frac{\mathrm{ord}_{w = i\infty}(f_t / f_h)}{h - t}.$$

By Lemma 4.3 combined with the above inequalities, there is $N > 1$ such that, whenever $w$ is sufficiently large in $U_{t,h}$, we have

$$N \ge \frac{|\beta_k(w)|}{|w^{-d_k} q^{e_k}|} \ge \frac{1}{N}$$

for some fixed determination of $w^{-d_k}$ on $U$. Choose $N$ so that it works simultaneously for any possible pair $t, h$. By continuity of $\beta_k$, the numbers $d_k, e_k$ do not depend on $t, h$.

Let $S$ be the set of indices $t$ such that $\mathrm{ord}_{w = i\infty}(f_t) + (t e_k, t d_k)$ reaches a minimum value $(e, d)$. By construction, $S$ contains at least two elements. Write

$$F(w^{-d_k} q^{e_k} z, w) = q^e w^{-d} \left( G_0(z) + G_1(z, w) \right),$$

where now $G_0$ is a non-trivial polynomial in $z$ with $|S| \ge 2$ terms and constant coefficients, and $G_1$ has coefficients that tend to 0 for $w \to \infty$ in $U_{t,h}$. Since

$$\frac{\beta_k(w)}{w^{-d_k} q^{e_k}}$$

is bounded and continuous, it must converge to a non-zero root $\alpha_k$ of $G_0(z)$; thus we have $\beta_k(w) \sim \alpha_k w^{-d_k} q^{e_k}$, as desired. ∎
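The choice of $(e_k, d_k)$ in this proof is a Newton-polygon computation over the lexicographically ordered group $\mathbb{Q}^2$: a candidate order $v$ for a root is one for which the minimum of $\mathrm{ord}(f_t) + t v$ is attained at least twice. The following minimal sketch enumerates such candidates; it is an illustration only, the function name `root_orders` is hypothetical, and the orders of the coefficients are assumed to be given as input (with `None` standing for a zero coefficient).

```python
from fractions import Fraction
from itertools import combinations


def root_orders(coeff_orders):
    """coeff_orders[k] is the order (e, d) of f_k, or None if f_k = 0.
    Return the set of pairs v = (ord f_t - ord f_h) / (h - t) for which the
    minimum of ord(f_l) + l * v is attained at both l = t and l = h."""
    present = [
        (k, (Fraction(o[0]), Fraction(o[1])))
        for k, o in enumerate(coeff_orders)
        if o is not None
    ]
    found = set()
    for (t, ot), (h, oh) in combinations(present, 2):
        v = tuple((a - b) / (h - t) for a, b in zip(ot, oh))
        shifted = [(o[0] + k * v[0], o[1] + k * v[1]) for k, o in present]
        at_t = (ot[0] + t * v[0], ot[1] + t * v[1])
        if min(shifted) == at_t:  # lexicographic comparison of tuples
            found.add(v)
    return found


if __name__ == "__main__":
    # F = z^2 - j(w): f_0 has order (-1, 0), f_1 = 0, f_2 has order (0, 0).
    print(root_orders([(-1, 0), None, (0, 0)]))
```

For $F = z^2 - j(w)$ the only candidate is $(-\tfrac{1}{2}, 0)$, matching the roots $\pm\sqrt{j(w)} \sim \pm q^{-1/2}$ and showing why rational, rather than integer, orders are needed.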

We now provide an estimate on the size of $F_r(w)$ that we can use on the boundary of $\Xi_r$.

Lemma 4.6

There exist $0 \le x_0 < 1$, $C, y_0 \ge 4$, $D, E_0, E_1 \ge 0$ (with $E_0$ possibly $+\infty$), $\delta > 0$ such that

$$|F_r(w)| = |F(r + sw, w)| > \delta \sum_{k=0}^{n-1} |f_k(w)| |r + sw|^k$$

for all $w \in \mathbb{H}$, $r \in \mathbb{R}$ such that $\mathrm{Im}(w) \ge y_0$, $|r| \ge \mathrm{Im}(w)^D$, and one of the following holds:

  • $0 \le \mathrm{Re}(w) \le 2$, $\mathrm{Im}(w) \le E_0 \log|r|$, or

  • $0 \le \mathrm{Re}(w) \le 2$, $E_1 \log|r| \le \mathrm{Im}(w)$, or

  • $\mathrm{Re}(w) \in \{x_0, x_0 + 1\}$.

Proof

We work in a region $U = \{ w \in \mathbb{H} : 0 \le \mathrm{Re}(w) \le 2,\ \mathrm{Im}(w) \ge y_0 \}$ for some $y_0$ sufficiently large as determined by this proof. We start by taking $y_0$ large enough that, by Lemma 4.3, $f_n$ has neither zeroes nor poles at $w$. We also require that $|r| \ge C$ for some $C$ sufficiently large, again as determined by this proof.

By Lemma 4.5, provided $y_0$ is sufficiently large, there are holomorphic functions

$$\beta_1, \dots, \beta_n : U \to \mathbb{C}$$

parametrising the roots of $F(z, w) = 0$ as functions of $w$, and for $w \to \infty$ in $U$, we have

$$\beta_k \sim \alpha_k w^{-d_k} q^{e_k} \quad \text{for some } \alpha_k \in \mathbb{C}^\times \text{ and } d_k, e_k \in \mathbb{Q}.$$

Since we are assuming $0 \le \mathrm{Re}(w) \le 2$, we also have $\mathrm{Im}(w) \le |w| \le \mathrm{Im}(w) + 2$. We require that $C \ge 4$, $y_0 \ge 4$, so that we have the following simple inequalities:

$$|r + sw| \ge \max\left\{ \frac{|r|}{2}, s\,\mathrm{Im}(w) \right\} \ge \max\left\{ \frac{|r|}{2}, \frac{s|w|}{2} \right\} \ge \frac{|r + sw|}{4}.$$

It follows, for instance, that $|w| \sim \mathrm{Im}(w) \to +\infty$ for $w \to \infty$ in $U$. We shall omit the specification “in $U$” in the rest of this proof.
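The chain of "simple inequalities" above is elementary, and it can be spot-checked on random samples satisfying $|r| \ge 4$, $0 \le \mathrm{Re}(w) \le 2$, $\mathrm{Im}(w) \ge 4$ and $s \in \{0, 1\}$. The sketch below is only a numerical sanity check, not a proof; the helper name `check_chain` and the sampling ranges are arbitrary choices made for illustration.

```python
import random


def check_chain(r, s, w):
    """Check |r + s w| >= max(|r|/2, s Im w) >= max(|r|/2, s|w|/2) >= |r + s w|/4."""
    lhs = abs(r + s * w)
    first = max(abs(r) / 2, s * w.imag)
    second = max(abs(r) / 2, s * abs(w) / 2)
    return lhs >= first >= second >= lhs / 4


def random_sample(rng):
    """Draw (r, s, w) with |r| >= 4, 0 <= Re(w) <= 2, Im(w) >= 4."""
    r = rng.choice([-1, 1]) * rng.uniform(4, 1e6)
    s = rng.choice([0, 1])
    w = complex(rng.uniform(0, 2), rng.uniform(4, 1e3))
    return r, s, w


if __name__ == "__main__":
    rng = random.Random(0)
    print(all(check_chain(*random_sample(rng)) for _ in range(10000)))
```

The extreme case $r = -4$, $\mathrm{Re}(w) = 2$ shows why the constant 4 appears: there $|r + \mathrm{Re}(w)| = |r|/2$ exactly.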

We now bound $|r + sw - \beta_k(w)|$, distinguishing multiple cases depending on $(e_k, d_k)$.

If $(e_k, d_k) \ge (0, 0)$, then $\beta_k(w)$ converges to a finite value $\gamma_k$ for $w \to \infty$. In this case, we require $C$ (if $s = 0$) and $y_0$ to be large enough that

$$|r + sw| \ge 4|\gamma_k| \ge 2|\beta_k(w)|.$$

This ensures that

$$|r + sw - \beta_k(w)| \ge \frac{|r + sw|}{2} \ge \frac{|r + sw| + |\beta_k(w)|}{4}.$$

Suppose that $e_k = 0$ and $d_k < 0$. For $|r| \ge \mathrm{Im}(w)^{2\max\{1, -d_k\}}$, we have both $|r| \ge 4|sw|$ and $|r| \ge 4|\beta_k(w)|$ for $w$ sufficiently large, and so, for $y_0$ large, we get

$$|r + sw - \beta_k(w)| \ge \frac{|r|}{2} \ge \frac{|r + sw| + |\beta_k(w)|}{8}.$$

If $e_k < 0$, then $sw - \beta_k(w) \sim -\beta_k(w)$ for $w \to \infty$. Since $\mathrm{Re}(w)$ is bounded, we get

$$\mathrm{Re}(\log(sw - \beta_k(w))) = \log|sw - \beta_k(w)| \sim \log|\beta_k(w)| \sim -2\pi e_k \mathrm{Im}(w).$$

We first give bounds when $|r|$ is roughly at least $|\beta_k|^2$, and when $|\beta_k|$ is roughly at least $|r|^2$. More precisely, for $y_0$ sufficiently large, we have

$$|r + sw - \beta_k(w)| > \frac{|r|}{2} \ge \frac{|r + sw| + |\beta_k(w)|}{8} \quad \text{for } |r| \ge e^{-4\pi e_k \mathrm{Im}(w)},$$

$$|r + sw - \beta_k(w)| > \frac{|\beta_k(w)|}{2} \ge \frac{|r + sw| + |\beta_k(w)|}{8} \quad \text{for } |r| \le e^{-\pi e_k \mathrm{Im}(w)}.$$

For $w \to \infty$, we also have

$$\mathrm{Im}(\log(sw - \beta_k(w))) \to \arg(\alpha_k) + \pi + 2\pi e_k \mathrm{Re}(w) \mod 2\pi.$$

Now choose $x_0$ so that $\arg(\alpha_k) + 2\pi e_k x_0$ and $\arg(\alpha_k) + 2\pi e_k (x_0 + 1)$ are not in $\pi\mathbb{Z}$, and so are different from $\arg(r) \in \pi\mathbb{Z}$. Note that we can do this simultaneously for all $k$ such that $e_k < 0$. In particular, there is $\delta_k > 0$ such that, after taking $y_0$ sufficiently large, we have

$$|r + sw - \beta_k(w)| > \delta_k \left( |r + sw| + |\beta_k(w)| \right) \quad \text{for } \mathrm{Re}(w) \in \{x_0, x_0 + 1\}.$$

Now, if e k = 0 for some 𝑘, let 𝐷 be the maximum between the values 2 d k and 2 for such 𝑘; otherwise, we can take D = 0 . If e k < 0 for some 𝑘, let E 0 be the maximum of 4 π e k and let E 1 be the minimum of π e k for such 𝑘; otherwise, let E 0 = + and E 1 = 0 . Under the above choices, there is δ > 0 such that

$$|F_r(w)| > \delta |f_n(w)| \prod_{k} \left( |r + sw| + |\beta_k(w)| \right) \ge \delta \sum_{k=0}^{n} |f_k(w)| |r + sw|^k$$

for any $w \in \mathbb{H}$, $r \in \mathbb{R}$ satisfying the requirements in the conclusion. ∎

We now recall the Argument Principle, which plays a key role in the proof of Proposition 4.8.

Theorem 4.7 (Argument Principle; see e.g. [14, Chapter VI, §1, Theorem 1.5])

Let 𝑓 be a meromorphic function on a complex domain Ω. Let 𝐶 be a simple closed curve (positively oriented) which is homologous to 0 in Ω and such that 𝑓 has no zeroes or poles on 𝐶. Let 𝑍 and 𝑃 respectively denote the number of zeroes and poles (counted with multiplicity) of 𝑓 in the interior of 𝐶. Then

$$2\pi i (Z - P) = \int_C \frac{f'(z)}{f(z)} \, dz = \int_{f \circ C} \frac{dz}{z}.$$

Figure 2

The region $\Xi_r$ highlighted by the striped background.

In the proof of the following proposition, we will integrate $F_r'/F_r$ along the boundary of $\Xi_r$ and use the above estimates to find a positive lower bound, thus proving the existence of zeroes of $F_r$ within $\Xi_r$. For a more geometric description, integrating $F_r'/F_r$ computes how many times the image $F_r(z)$ winds around 0 while $z$ moves along $\partial\Xi_r$. The bounds below will determine a rough picture of $F_r(\partial\Xi_r)$, as in Figure 3, and in turn determine the number of zeroes, counted with multiplicity.

Figure 3

Visual representation of the action of $F_r$ on the boundary of a typical rectangle $\Xi_r$ for $F$ of the form $Az^2 + Bz + Cj(z)^2 + Dj(z) + E$. The term $j^2$ has lowest order $(-2, 0)$, so $F_r$ winds around 0 twice while following the top side of the rectangle.

Proposition 4.8

Let $(e, d)$ be the minimum order of $f_k / f_n$ at $i\infty$ for $k = 0, \dots, n-1$ and suppose that $e < 0$. We work under the notation of Lemma 4.6. Then, for all $r \in \mathbb{R}$ sufficiently large, the function $F(r + sw, w)$ has $-e$ zeroes, counted with multiplicity, within the region (Figure 2)

$$\Xi_r = \left\{ w \in \mathbb{H} : x_0 < \mathrm{Re}(w) < x_0 + 1,\ E_0 \log|r| < \mathrm{Im}(w) < |r|^{\frac{1}{M}} \right\},$$

where $M = D$ if $D > 0$ and $M = 1$ otherwise. Moreover, for $w \in \partial\Xi_r$, we have

$$|F(r + sw, w)| \ge \delta \sum_{k=0}^{n} |f_k(w)| |r + sw|^k.$$

Proof

Recall that $F_r(w) = F(r + sw, w)$. First, we apply Lemma 4.6 to $F_r(w)$ and find relevant constants $x_0$, $y_0$, $\delta$, $C$, $D$, $E_0$, $E_1$. Let $M = D$ if $D > 0$ and $M = 1$ otherwise. Let $r$ be large enough so that $|r| \ge C$, $E_0 \log|r| \ge y_0$, and $E_1 \log|r| \le |r|^{\frac{1}{M}}$; hence

$$|F_r(w)| \ge \delta \sum_{k=0}^{n} |f_k(w)| |r + sw|^k$$

for $w \in \partial\Xi_r$. It follows, for instance, that $F_r(w)$ has order $(e, d)$ at $i\infty$. We also take $r$ sufficiently large so that each $f_k(w)$ does not have poles in $\Xi_r$, so in particular $F_r(w)$ is holomorphic on $\Xi_r$.

Note that $F_r(w)$ is never zero on the boundary of $\Xi_r$; thus its logarithmic derivative $F_r'/F_r$ is holomorphic there. We shall now compute the integral of $F_r'/F_r$ along such a boundary.

Vertical sides. We show that the images of the vertical sides of $\Xi_r$ under $F_r$ must be close to each other, and so their contributions cancel out as they are taken with opposite orientations. For instance, if $F_r$ happens to be 1-periodic (as a function of $w$, for instance, when $s = 0$ and the coefficients of $F$ are 1-periodic), then the images of these vertical sides are identical.

First, we observe that

$$F_r(w+1) - F_r(w) = \sum_{k=0}^{n} f_k(w) (r + sw)^k \left( \frac{f_k(w+1)}{f_k(w)} \left( \frac{r + sw + s}{r + sw} \right)^k - 1 \right).$$

Since $f_k \in \mathcal{P}(w)$, we have that

$$f_k(w+1) \sim f_k(w) \quad \text{as } \mathrm{Im}(w) \to +\infty.$$

Likewise, $r + sw + s \sim r + sw$ for $r + sw \to \infty$. Therefore, we can choose $r$ large enough so that the last factor on the right-hand side has modulus less than $\frac{\delta}{2}$ for all $k$ and for any $w$ on the boundary of $\Xi_r$. We then have

$$\left| \frac{F_r(w+1)}{F_r(w)} - 1 \right| = \left| \frac{F_r(w+1) - F_r(w)}{F_r(w)} \right| \le \frac{\delta}{2|F_r(w)|} \left( \sum_{k=0}^{n} |f_k(w)| |r + sw|^k \right) < \frac{1}{2} \tag{4.1}$$

for $w \in \partial\Xi_r$. We may now choose a branch of $\log$ in the disc around 1 of radius $\frac{1}{2}$ and estimate the integral along the vertical sides as[4]

$$\left| \int_{E_0 \log|r|}^{|r|^{\frac{1}{M}}} \left( \frac{F_r'(x_0 + 1 + iy)}{F_r(x_0 + 1 + iy)} - \frac{F_r'(x_0 + iy)}{F_r(x_0 + iy)} \right) dy \right| = \left| \log\left( \frac{F_r(w+1)}{F_r(w)} \right) \Bigg|_{w = x_0 + iE_0\log|r|}^{w = x_0 + i|r|^{\frac{1}{M}}} \right| < \log\left(\frac{3}{2}\right) - \log\left(\frac{1}{2}\right) = \log(3) < 2.$$

Bottom side. We now show that the image of the bottom side is away from 0 and cannot wind much. Using (4.1), for $r$ sufficiently large,

$$\left| \int_{x_0}^{x_0 + 1} \frac{F_r'(x + iE_0\log|r|)}{F_r(x + iE_0\log|r|)} \, dx \right| = \left| \log\left( \frac{F_r(x_0 + iE_0\log|r| + 1)}{F_r(x_0 + iE_0\log|r|)} \right) \right| < \log\left(\frac{3}{2}\right) < 1.$$

Top side. We show that $F_r$ behaves like $\exp(2\pi i e w)$ on the top side of $\Xi_r$, and so the image under $F_r$ is roughly a circle traversed approximately $-e$ times.

We now constrain $w$ to the region $\mathrm{Im}(w) = |r|^{\frac{1}{M}}$, $x_0 \le \mathrm{Re}(w) \le x_0 + 1$. We have

$$\delta \max_k |f_k(w)| |r + sw|^k \le |F_r(w)| \le n \max_k |f_k(w)| |r + sw|^k.$$

By construction, $r \sim \zeta w^M$ for some power $\zeta$ of $i$ depending on $M$ and the sign of $r$. For simplicity, fix the sign of $r$, and assume that $\zeta = 1$, so that $r \sim w^M$. Then

$$F(w^M + sw, w) - F_r(w) = \sum_{k=0}^{n} f_k(w) (r + sw)^k \left( \left( \frac{w^M + sw}{r + sw} \right)^k - 1 \right).$$

In particular, by Lemma 4.6, we find that $F(w^M + sw, w) - F_r(w) = o(F_r(w))$ for $r \to +\infty$. Since $F(w^M + sw, w)$ is in $\mathcal{P}(w)$, it has an order $(e', d')$ at $i\infty$, and in fact, $e' = e$ because the term $w^M$ cannot alter the exponential growth.

Therefore, we find that there is $\alpha \in \mathbb{C}^\times$ such that, for $r$ large enough,

$$\left| \frac{F_r(w)}{\alpha w^{-d'} \exp(2\pi i e w)} - 1 \right| < \frac{1}{4}.$$

Observe that, for $r$ large enough, we have

$$\left| \int_{x_0}^{x_0+1} \frac{\left( (x + i|r|^{\frac{1}{M}})^{-d'} \exp(2\pi i e (x + i|r|^{\frac{1}{M}})) \right)'}{(x + i|r|^{\frac{1}{M}})^{-d'} \exp(2\pi i e (x + i|r|^{\frac{1}{M}}))} \, dx - 2\pi i e \right| = \left| \int_{x_0}^{x_0+1} \frac{-d'}{x + i|r|^{\frac{1}{M}}} \, dx \right| < 1.$$

Thus, for sufficiently large $r$, we have

$$\left| \int_{x_0}^{x_0+1} \frac{F_r'(x + i|r|^{\frac{1}{M}})}{F_r(x + i|r|^{\frac{1}{M}})} \, dx - 2\pi i e \right| \le \left| \log\left( \frac{F_r(w)}{\alpha w^{-d'} \exp(2\pi i e w)} \right) \Bigg|_{w = x_0 + i|r|^{\frac{1}{M}}}^{w = x_0 + 1 + i|r|^{\frac{1}{M}}} \right| + 1 < 2.$$

Conclusion. Using the above estimates, we can now find the winding number of $F_r(\partial\Xi_r)$ at 0. Summing up the contributions from all sides, with the appropriate orientations, we obtain

$$\left| \frac{1}{2\pi i} \int_{\partial\Xi_r} \frac{F_r'(w)}{F_r(w)} \, dw + e \right| < \frac{1}{2\pi} (2 + 1 + 2) < 1.$$

By the Argument Principle, the integral on the left-hand side must be the difference between the number of zeroes and poles of $F_r(w)$ inside $\Xi_r$ (in particular an integer) multiplied by $2\pi i$. Since $F_r(w)$ is holomorphic on $\Xi_r$, it must have $-e$ zeroes in $\Xi_r$, counted with multiplicity. ∎

Proof of Proposition 1.6

This follows from combining Propositions 4.1 and 4.8, where we use $r = m$ a large integer and $s = 1$. Note that if $f_k(z)/f_n(z) \to \infty$ as $z \to i\infty$, then $f_k(z)/f_n(z)$ must have exponential growth at $i\infty$, for it is periodic and so has a $q$-expansion. ∎

5 Some criteria for Zariski density

Recall that $\mathbf{Y} := (Y_0, Y_1, Y_2)$ and $\mathbf{j} := (j, j', j'') : \mathbb{H} \to \mathbb{C}^3$.

5.1 Generic transforms

Given $p \in \mathbb{C}[X, \mathbf{Y}]$, or more generally $p \in K[X, \mathbf{Y}]$ for some field $K$, we define the generic $\mathrm{SL}_2(\mathbb{Z})$-transform of $p$ as the polynomial $\Gamma(p) \in K[Z, W, C, \mathbf{Y}]$ given by

$$\Gamma(p)(Z, W, C, \mathbf{Y}) := p(Z, Y_0, W^2 Y_1, W^4 Y_2 + 2CW^3 Y_1).$$

In particular, we have $\deg_X(p) = \deg_Z(\Gamma(p))$. By construction, for any $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$, if $n = \deg_X(p)$, we have

$$p(\gamma z, \mathbf{j}(\gamma z)) = \frac{p^\gamma(z, \mathbf{j}(z))}{(cz + d)^n}, \quad \text{where } p^\gamma(X, \mathbf{Y}) := (cX + d)^n \, \Gamma(p)\left( \frac{aX + b}{cX + d}, cX + d, c, \mathbf{Y} \right).$$

Note that $p^\gamma \in K[X, \mathbf{Y}]$.
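The definition of $\Gamma$ encodes the transformation laws $j'(\gamma z) = (cz+d)^2 j'(z)$ and $j''(\gamma z) = (cz+d)^4 j''(z) + 2c(cz+d)^3 j'(z)$. These can be checked numerically. The sketch below is an illustration under stated assumptions: it computes $j$, $j'$, $j''$ from the standard $q$-series of $E_2$, $E_4$, $E_6$, $\Delta$ together with Ramanujan's differential identities (an approach chosen here for convenience, not taken from the paper), and represents polynomials as plain Python callables. The names `jet` and `gamma_transform` are hypothetical.

```python
import cmath
import math


def jet(z, terms=400):
    """Return (j(z), j'(z), j''(z)) using truncated Eisenstein q-series.
    Assumes D = q d/dq, D j = -E4^2 E6 / Delta and Ramanujan's identities."""
    q = cmath.exp(2j * math.pi * z)
    e2, e4, e6, delta = 1 + 0j, 1 + 0j, 1 + 0j, q
    qn = 1 + 0j
    for n in range(1, terms + 1):
        qn *= q
        lam = qn / (1 - qn)
        e2 -= 24 * n * lam
        e4 += 240 * n**3 * lam
        e6 -= 504 * n**5 * lam
        delta *= (1 - qn) ** 24
    j0 = e4**3 / delta
    dj = -(e4**2) * e6 / delta
    d2j = (-e2 * e4**2 * e6 / 6 + 2 * e4 * e6**2 / 3 + e4**4 / 2) / delta
    tau = 2j * math.pi  # d/dz = 2 pi i * D
    return j0, tau * dj, tau**2 * d2j


def gamma_transform(p):
    """Generic transform: Gamma(p)(Z, W, C, Y) = p(Z, Y0, W^2 Y1, W^4 Y2 + 2 C W^3 Y1)."""
    return lambda Z, W, C, Y0, Y1, Y2: p(
        Z, Y0, W**2 * Y1, W**4 * Y2 + 2 * C * W**3 * Y1
    )


if __name__ == "__main__":
    a, b, c, d = 2, 1, 1, 1            # a sample matrix of determinant 1
    z = 0.1 + 1.2j
    gz = (a * z + b) / (c * z + d)
    p = lambda X, Y0, Y1, Y2: X * Y2 + Y0 * Y1**2 - 3 * Y1 + X**2
    print(p(gz, *jet(gz)))
    print(gamma_transform(p)(gz, c * z + d, c, *jet(z)))
```

The two printed values agree up to rounding, which is the identity $p(\gamma z, \mathbf{j}(\gamma z)) = \Gamma(p)(\gamma z, cz + d, c, \mathbf{j}(z))$ underlying the definition of $p^\gamma$.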

We make the following observations.

  1. The map $\Gamma : K[X, \mathbf{Y}] \to K[Z, W, C, \mathbf{Y}]$ defined above is a $K$-algebra homomorphism with left inverse $p(X, \mathbf{Y}) = \Gamma(p)(X, 1, 0, \mathbf{Y})$.

  2. For each $\gamma \in \mathrm{SL}_2(\mathbb{Z})$, the map $p \mapsto p^\gamma$ is multiplicative, that is, $(p_1 p_2)^\gamma = p_1^\gamma p_2^\gamma$ for any $p_1, p_2 \in K[X, \mathbf{Y}]$. Indeed, this follows at once from the fact that $\Gamma$ is a homomorphism (by 1) and $\deg_X(p_1 p_2) = \deg_X(p_1) + \deg_X(p_2)$.

  3. For any $p \in K[X, \mathbf{Y}]$ and any $\gamma_1, \gamma_2 \in \mathrm{SL}_2(\mathbb{Z})$, there is $r \in \mathbb{Z}[X]$ such that

    $$(p^{\gamma_1})^{\gamma_2} = r(X) \, p^{\gamma_1 \gamma_2}.$$

    Indeed, let n = deg X ( p ) and m = deg X ( p γ 1 ) . By construction, m = n + deg Y 0 ( p ) n . Now write

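    $$\gamma_t = \begin{pmatrix} a_t & b_t \\ c_t & d_t \end{pmatrix} \text{ for } t \in \{1, 2\}, \quad \text{and} \quad \gamma_1 \gamma_2 = \begin{pmatrix} \tilde{a} & \tilde{b} \\ \tilde{c} & \tilde{d} \end{pmatrix}.$$

    Thus

    $$\frac{p^{\gamma_1 \gamma_2}(z, \mathbf{j}(z))}{(\tilde{c} z + \tilde{d})^n} = p(\gamma_1 \gamma_2 z, \mathbf{j}(\gamma_1 \gamma_2 z)) = \frac{p^{\gamma_1}(\gamma_2 z, \mathbf{j}(\gamma_2 z))}{(c_1 \gamma_2 z + d_1)^n} = \frac{(p^{\gamma_1})^{\gamma_2}(z, \mathbf{j}(z))}{(c_1 \gamma_2 z + d_1)^n (c_2 z + d_2)^m}.$$

    Since $m \ge n$ and $(c_1 \gamma_2 z + d_1)(c_2 z + d_2) = \tilde{c} z + \tilde{d}$, we get

    $$r(X) := \frac{(c_2 X + d_2)^m}{(\tilde{c} X + \tilde{d})^n} \left( c_1 \frac{a_2 X + b_2}{c_2 X + d_2} + d_1 \right)^n = (c_2 X + d_2)^{m-n} \in \mathbb{Z}[X].$$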

  4. For any $p \in K[X, \mathbf{Y}]$ and any $\gamma \in \mathrm{SL}_2(\mathbb{Z})$, if we consider $p$ and $p^\gamma$ as polynomials in the variables $\mathbf{Y}$ with coefficients in $K(X)$, then $p$ is irreducible in $K(X)[\mathbf{Y}]$ if and only if so is $p^\gamma$.

    Indeed, note that if $p$ is not a unit (meaning it contains one of the variables $Y_0, Y_1, Y_2$), then $p^\gamma$ is also not a unit. It follows by 2 that if $p$ is reducible in $K(X)[\mathbf{Y}]$, thus a product of two non-units, then so is $p^\gamma$. Likewise, if $p^\gamma$ is reducible, then $(p^\gamma)^{\gamma^{-1}}$ is reducible too, and $(p^\gamma)^{\gamma^{-1}} = r(X) p$ for some $r(X) \in \mathbb{C}[X]$ by 3; since $r(X)$ is a unit, it follows that $p$ is reducible.
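The computation in observation 3 rests on the cocycle identity $(c_1 \gamma_2 z + d_1)(c_2 z + d_2) = \tilde{c} z + \tilde{d}$. The following minimal sketch checks it in exact rational arithmetic, together with the resulting formula $r(X) = (c_2 X + d_2)^{m-n}$ at a sample point. Matrices are stored as tuples `(a, b, c, d)`; the helper names and the sample values of $n$, $m$ are hypothetical choices made for illustration.

```python
from fractions import Fraction


def mat_mul(g1, g2):
    """Product of 2x2 matrices stored as (a, b, c, d)."""
    a1, b1, c1, d1 = g1
    a2, b2, c2, d2 = g2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)


def act(g, x):
    """Linear-fractional action x -> (a x + b) / (c x + d)."""
    a, b, c, d = g
    return (a * x + b) / (c * x + d)


def automorphy(g, x):
    """The factor c x + d."""
    return g[2] * x + g[3]


def r_factor(g1, g2, n, m, x):
    """The value at x of r(X) from observation 3, computed from its definition."""
    g12 = mat_mul(g1, g2)
    return (automorphy(g2, x) ** m * automorphy(g1, act(g2, x)) ** n
            / automorphy(g12, x) ** n)


if __name__ == "__main__":
    g1, g2 = (2, 1, 1, 1), (1, 1, 1, 2)   # both of determinant 1
    x = Fraction(3, 7)
    print(automorphy(g1, act(g2, x)) * automorphy(g2, x),
          automorphy(mat_mul(g1, g2), x))
    print(r_factor(g1, g2, 2, 5, x), automorphy(g2, x) ** 3)
```

Since the identity is algebraic, checking it at rational points with `Fraction` avoids any rounding issues.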

Proposition 5.1

Let $p$ be an irreducible polynomial in $\mathbb{C}[X, \mathbf{Y}] \setminus \mathbb{C}[X]$ and $\gamma \in \mathrm{SL}_2(\mathbb{Z})$. Let $h$ be the irreducible factor of $p^\gamma$ that is not in $\mathbb{C}[X]$. Then the equation $p(z, \mathbf{j}(z)) = 0$ has a Zariski dense set of solutions if and only if the equation $h(z, \mathbf{j}(z)) = 0$ has a Zariski dense set of solutions.

Proof

By 3 and 4, $p^\gamma = r(X) h$ and $h^{\gamma^{-1}} = s(X) p$ for some $r, s \in \mathbb{C}[X]$. It follows that $p(z, \mathbf{j}(z)) = 0$ and $h(z, \mathbf{j}(z)) = 0$ have the same solutions except possibly for the zeroes of $r(z)$ and $s(z)$. Since those are only finitely many, the solutions of the former equation are Zariski dense if and only if so are the solutions of the latter. ∎

5.2 Density criteria

We now apply the results of Section 4 to establish some useful criteria for Zariski density of solutions of equations involving $z$, $j(z)$, $j'(z)$, $j''(z)$.

Definition 5.2

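Given a function $g \in \mathbb{C}(z, \mathbf{j}(z))$ and $\gamma \in \mathrm{SL}_2(\mathbb{Z})$, we say that $g(z)$ has exponential growth in $\gamma\mathbb{F}$ if $g(\gamma^{-1} z)$ has exponential growth at $i\infty$. Furthermore, if $r$ is the cusp of $\gamma\mathbb{F}$ (that is, $r \in \mathbb{Q} \cup \{\infty\}$ is in the Euclidean closure of $\gamma\mathbb{F}$), then we define the order of $g(z)$ in $\gamma\mathbb{F}$ at $r$ as $\mathrm{ord}_{z = i\infty}(g(\gamma^{-1} z))$.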

Proposition 5.3

Let $F(X, \mathbf{Y}) = \sum_{k=0}^{n} X^k p_k(\mathbf{Y})$ be a polynomial. Assume that, for some $k$ and some $\gamma \in \mathrm{SL}_2(\mathbb{Z})$, the function $p_k(\mathbf{j}(z)) / p_n(\mathbf{j}(z))$ has exponential growth in $\gamma\mathbb{F}$. Then there are $\ell > 0$, $0 \le x_0 < 1$, $M > 0$, $E_0 > 0$ such that, for all $m \in \mathbb{Z}$ sufficiently large, the function $F(z, \mathbf{j}(z))$ has $\ell$ zeroes, counted with multiplicity, within the region $m + \gamma\Xi_m$, where

$$\Xi_m = \left\{ z \in \mathbb{H} : x_0 < \mathrm{Re}(z) < x_0 + 1,\ E_0 \log|m| < \mathrm{Im}(z) < |m|^{\frac{1}{M}} \right\}.$$

Proof

Fix some $s \in \{0, 1\}$, $t \in \mathbb{R}$ to be determined later. For $m \in \mathbb{Z}$, let

$$F_m(z) := F(m + t + sz, \mathbf{j}(\gamma z)).$$

Likewise, set $G(z) := F(z, \mathbf{j}(z))$. By Proposition 4.8 applied to $F(z, \mathbf{j}(\gamma z))$, $F_m(z)$ has $\ell > 0$ zeroes in a certain region $\Xi_m$ and is suitably bounded from below for $z \in \partial\Xi_m$, as long as $m$ is sufficiently large.

If $\gamma$ is upper triangular, namely $\gamma z = z + k$ for some $k$, we choose $s = 1$, $t = 0$, and observe that, since the functions of $\mathbf{j}$ are 1-periodic, $G(z + m) = F(z + m, \mathbf{j}(\gamma z)) = F_m(z)$; thus $G(z)$ has $\ell$ zeroes in each region $m + \Xi_m$.

Otherwise, let $s = 0$ and let $t$ be the limit of $\gamma z$ as $z \to i\infty$ (where in fact $t \in \mathbb{Q}$). In particular,

$$G(m + \gamma z) - F_m(z) = F(m + \gamma z, \mathbf{j}(\gamma z)) - F(m + t, \mathbf{j}(\gamma z)) = \sum_{k=0}^{n} p_k(\mathbf{j}(\gamma z)) (m + t)^k \left( \left( \frac{m + \gamma z}{m + t} \right)^k - 1 \right).$$

Thus, as soon as $z$ is sufficiently large, the last factor on the right-hand side has modulus less than $\frac{1}{2}$ independently of $m$. Then pick $m$ large enough so that this happens whenever $\mathrm{Im}(z) > E_0 \log|m|$, and so

$$|G(m + \gamma z) - F_m(z)| < \frac{1}{2} |F_m(z)|.$$

Therefore, by Rouché’s Theorem 3.1, $G(m + \gamma z)$ and $F_m(z)$ have the same number of zeroes in $\Xi_m$, counted with multiplicity. It follows that $G(z)$ has $\ell$ zeroes in the region $m + \gamma\Xi_m$. ∎

Proposition 5.4

Let $F(X, \mathbf{Y}) = \sum_{k=0}^{n} X^k p_k(\mathbf{Y})$ be irreducible. Suppose for some $k$ the function $P(z) = p_k(\mathbf{j}(z)) / p_n(\mathbf{j}(z))$ satisfies one of the following:

  1. $P(z)$ has a pole in $\mathbb{H}$, or

  2. $P(z)$ has exponential growth in some fundamental domain.

Then the equation $F(z, \mathbf{j}(z)) = 0$ has a Zariski dense set of solutions.

Proof

Suppose by contradiction that all the solutions of $F(z, \mathbf{j}(z)) = 0$ lie on a further hypersurface $G = 0$, where $G \in \mathbb{C}[X, \mathbf{Y}]$ is a non-constant polynomial not divisible by $F$. In particular, the algebraic subset of $\mathbb{C}^4$ defined by $\{F = G = 0\}$ has dimension two, so its projection onto the variables $Y_0, Y_1, Y_2$ has dimension at most two, meaning that the solutions satisfy an equation $H(\mathbf{j}(z)) = 0$ for some non-constant polynomial $H \in \mathbb{C}[\mathbf{Y}]$. The assumption on $F$ implies that $F$ depends on the variable $X$ (i.e. $n \ge 1$); thus $F$ and $H$ are coprime.

By Propositions 4.1 and 5.3, for $m \in \mathbb{Z}$ large, there are regions $\Xi_m$ such that the original equation has solutions in $m + \Xi_m$, and moreover the real part of each $\Xi_m$ is bounded from above and below and the imaginary part is bounded away from 0 uniformly in $m$. Each solution can be in $m + \Xi_m$ for at most finitely many $m \in \mathbb{Z}$. This implies that, for some $m$ sufficiently large, the function $H(\mathbf{j}(z))$ has infinitely many zeroes in the region $\bigcup_{|k| > |m|} \Xi_k$. If this union is bounded, then we conclude that $H(\mathbf{j}(z))$ is constantly zero by the identity theorem from complex analysis, but this contradicts the algebraic independence of $j$, $j'$, $j''$. So we assume that the union is unbounded, but by Lemma 4.3, there exist $\alpha \in \mathbb{C}^\times$, $d, e \in \mathbb{Z}$, and $C > 0$ such that $H(\mathbf{j}(z)) \sim \alpha z^d \exp(e 2\pi i z)$ in the region $U := \{ z \in \mathbb{H} : \mathrm{Im}(z) \ge C \log|z| \}$. Then the only way $H(\mathbf{j}(z))$ can have infinitely many zeroes in $\bigcup_{|k| > |m|} \Xi_k$ is if it is constantly zero, again contradicting the algebraic independence of $j$, $j'$, $j''$. This completes the proof. ∎

Corollary 5.5

Let $F(X, \mathbf{Y}) = \sum_{k=0}^{n} X^k p_k(\mathbf{Y})$ be irreducible. Suppose that $p_n$ has a factor $h$ such that the equation $h(\mathbf{j}(z)) = 0$ has a Zariski dense set of solutions. Then the equation $F(z, \mathbf{j}(z)) = 0$ has a Zariski dense set of solutions.

Proof

If $n = 0$, then $F = p_0$ which, by irreducibility, is equal to a constant multiple of $h$, and so the result is immediate. If instead $n > 0$, then by irreducibility of $F$ for some $k \in \{0, \dots, n-1\}$, $p_k$ is non-zero and not divisible by $h$. Then $p_k(\mathbf{j}(z)) / p_n(\mathbf{j}(z))$ will have poles in $\mathbb{H}$ at those solutions of $h(\mathbf{j}(z)) = 0$ which satisfy $p_k(\mathbf{j}(z)) \ne 0$ (which exist since we are assuming Zariski density of the solutions of $h(\mathbf{j}(z)) = 0$). So now the corollary follows from Proposition 5.4. ∎

6 Zero estimates

In view of the results in the previous section, we will now look at the poles of quotients of the form $p_k(\mathbf{j}) / p_n(\mathbf{j})$, with $p_k, p_n \in \mathbb{C}[\mathbf{Y}]$. We keep using the notation introduced in Section 5.1.

Given $p \in \mathbb{C}[X, \mathbf{Y}]$, we will prove a few estimates on the order of $p(z, \mathbf{j}(z))$ at different points, distinguishing three cases. Before that, we note that specialising the variables $Y_1$ and $Y_2$ of $\Gamma(p)$ at some complex values will almost always return an “obfuscated” copy of the original polynomial, which for instance cannot be constant unless $p$ itself was. More precisely, we note the following trivial identity.

Lemma 6.1

For every $\alpha, \beta \in \mathbb{C}$ with $\alpha \ne 0$, we have

$$\Gamma(p)\left( X, \alpha^{-1} U_1, \alpha \frac{U_2 - \alpha^{-4} \beta U_1^4}{2 U_1^3}, Y_0, \alpha^2, \beta \right) = p(X, Y_0, U_1^2, U_2).$$

Proof

Immediate. ∎
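Although the identity is immediate, it is easy to mistype, so a quick exact check may be useful. The sketch below represents polynomials as Python callables and evaluates both sides of Lemma 6.1 at rational points using `Fraction`; the sample polynomial and the helper names `gamma_transform` and `lemma_6_1_sides` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction


def gamma_transform(p):
    """Gamma(p)(Z, W, C, Y0, Y1, Y2) = p(Z, Y0, W^2 Y1, W^4 Y2 + 2 C W^3 Y1)."""
    return lambda Z, W, C, Y0, Y1, Y2: p(
        Z, Y0, W**2 * Y1, W**4 * Y2 + 2 * C * W**3 * Y1
    )


def lemma_6_1_sides(p, X, Y0, U1, U2, alpha, beta):
    """Evaluate both sides of the identity of Lemma 6.1 at a sample point."""
    W = U1 / alpha
    C = alpha * (U2 - beta * U1**4 / alpha**4) / (2 * U1**3)
    lhs = gamma_transform(p)(X, W, C, Y0, alpha**2, beta)
    rhs = p(X, Y0, U1**2, U2)
    return lhs, rhs


if __name__ == "__main__":
    p = lambda X, Y0, Y1, Y2: X**2 * Y2**3 - 5 * Y0 * Y1 * Y2 + Y1**4 + X
    point = dict(X=Fraction(2, 3), Y0=Fraction(-1, 5), U1=Fraction(3, 4),
                 U2=Fraction(7, 2), alpha=Fraction(3, 2), beta=Fraction(-5, 7))
    print(lemma_6_1_sides(p, **point))
```

With $W = \alpha^{-1} U_1$ one gets $W^2 \alpha^2 = U_1^2$, and the chosen value of $C$ makes $W^4 \beta + 2 C W^3 \alpha^2$ collapse to $U_2$, so both sides coincide exactly.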

First, we consider unramified points of $j$, that is, points $\tau \in \mathbb{H}$ such that $j'(\tau) \ne 0$. If $(Y_0 - j(\tau))^s$ divides $p$, then $p(z, \mathbf{j}(z))$ has obviously order at least $s$ at all conjugates of $\tau$. We show that the order is exactly $s$ for most conjugates.

Proposition 6.2

Let $p \in \mathbb{C}[X, \mathbf{Y}]$ be non-zero, $\tau \in \mathbb{H}$. Suppose that $j'(\tau) \ne 0$ and let $s$ be the maximum integer such that $(Y_0 - j(\tau))^s$ divides $p$. Then, for all $\gamma$ in a Zariski open dense subset of $\mathrm{SL}_2(\mathbb{Z})$, the function $p(z, \mathbf{j}(z))$ has order $s$ at $z = \gamma\tau$.

Proof

Let $f \in \mathbb{C}[X, \mathbf{Y}]$ be such that $p = (Y_0 - j(\tau))^s f$. Then we need to show that $f(z, \mathbf{j}(z))$ has order 0 at $z = \gamma\tau$, for all $\gamma$ in a Zariski open dense subset of $\mathrm{SL}_2(\mathbb{Z})$. In other words, it suffices to prove the proposition for the case when $p$ is not divisible by $Y_0 - j(\tau)$ (and hence $s = 0$).

From now on, we assume $s = 0$. By Lemma 6.1, at $\alpha^2 = j'(\tau) \ne 0$, $\beta = j''(\tau)$, $U_1^2 = Y_1$, $U_2 = Y_2$, there are $V_1, V_2 \in \mathbb{C}(U_1, U_2)$ such that $\Gamma(p)(X, V_1, V_2, \mathbf{j}(\tau)) = p(X, j(\tau), Y_1, Y_2)$. Since $p$ is not divisible by $Y_0 - j(\tau)$, $p(X, j(\tau), Y_1, Y_2)$ is a non-zero polynomial; hence, in particular, $r(Z, W, C) := \Gamma(p)(Z, W, C, \mathbf{j}(\tau))$ is also non-zero.

Now take $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$ and write

$$p^\gamma(\tau, \mathbf{j}(\tau)) = (c\tau + d)^n r(\gamma\tau, c\tau + d, c) = (c\tau + d)^n p(\gamma\tau, \mathbf{j}(\gamma\tau)),$$

where $n = \deg_X(p)$. In particular, $p(\gamma\tau, \mathbf{j}(\gamma\tau)) = 0$ if and only if $r(\gamma\tau, c\tau + d, c) = 0$.

The map $\upsilon : \mathrm{SL}_2(\mathbb{C}) \to \mathbb{C}^3$ given by $\gamma \mapsto (\gamma\tau, c\tau + d, c)$ is injective, and since

$$\dim \mathrm{SL}_2(\mathbb{C}) = 3 = \dim \mathbb{C}^3,$$

by the fibre-dimension theorem, $\upsilon$ is also dominant. As $r$ is a non-zero polynomial, there is a non-empty Zariski open subset $U_1$ of $\mathbb{C}^3$ such that $r$ never vanishes on $U_1$. This then gives a Zariski open subset $U_0$ of $\mathrm{SL}_2(\mathbb{C})$ such that, for every $\gamma \in U_0$, $\upsilon(\gamma) \in U_1$. This implies that, for $\gamma \in U_0$, $p(z, \mathbf{j}(z))$ does not vanish at $z = \gamma\tau$, and hence $p(z, \mathbf{j}(z))$ has order $s = 0$ at $z = \gamma\tau$. ∎

With this proposition, we obtain the following special case of Theorem 1.2.

Corollary 6.3

The equation $j(z) - u = 0$ has a Zariski dense set of solutions if and only if $u \notin \{0, 1728\}$.

Proof

If $u \notin \{0, 1728\}$, then for any $\tau \in \mathbb{H}$ satisfying $j(\tau) = u$, we have $j'(\tau) \neq 0$. We need to show that the solutions of the equation $j(z) = u$ are Zariski dense. So suppose that $p(X, \mathbf{Y}) \in \mathbb{C}[X, \mathbf{Y}]$ is such that its zero locus contains all the solutions of $j(z) = u$. The solutions of $j(z) = u$ are precisely $\mathrm{SL}_2(\mathbb{Z})\tau$, where $j(\tau) = u$, so $p(\gamma\tau, \mathbf{j}(\gamma\tau)) = 0$ for all $\gamma \in \mathrm{SL}_2(\mathbb{Z})$. Therefore, $p(z, \mathbf{j}(z))$ has positive order at $\gamma\tau$ for all $\gamma \in \mathrm{SL}_2(\mathbb{Z})$, so by Proposition 6.2, $Y_0 - u$ divides $p$, thus proving Zariski density.

On the other hand, if $u \in \{0, 1728\}$, then for any $\tau \in \mathbb{H}$ satisfying $j(\tau) = u$, we have that $j'(\tau) = 0$, so the solutions of $j(z) = u$ lie in the proper Zariski closed subset given by $Y_1 = 0$. ∎

Now we consider the behaviour of $p(z, \mathbf{j}(z))$ towards the cusps of the fundamental domains. We write $T\mathbf{Y} := (TY_0, TY_1, TY_2)$. One can easily see that the order at the cusp of any given fundamental domain is at least $(-e, -N)$ for some $N$, where $e = \deg_T(p(X, T\mathbf{Y}))$. This is not far from the actual behaviour at most cusps.

Proposition 6.4

Let $p \in \mathbb{C}[X, \mathbf{Y}]$. Then there is $0 \leq M \leq \deg_W(\Gamma(p))$ such that, for all $\gamma$ in a Zariski open dense subset of $\mathrm{SL}_2(\mathbb{Z})$, the function $p(z, \mathbf{j}(z))$ has order $(-e, -M)$ at the cusp of $\gamma\mathbb{F}$, where $e$ is the degree of $p(X, T\mathbf{Y})$ in $T$.

Proof

Let $n = \deg_X(p)$, $e = \deg_T(p(X, T\mathbf{Y}))$, and write

\[ p(X, T\mathbf{Y}) = \sum_{k=0}^{e} T^k p_k(X, \mathbf{Y}), \]

where each $p_k$ is homogeneous of degree $k$ in the variables $\mathbf{Y}$, namely

\[ p_k(X, T\mathbf{Y}) = T^k p_k(X, \mathbf{Y}). \]

Since $\Gamma(p)(Z, W, C, T\mathbf{Y}) = \Gamma(p(X, T\mathbf{Y}))$, we have

\[ \Gamma(p)(Z, W, C, T\mathbf{Y}) = \sum_{k=0}^{e} T^k \, \Gamma(p_k)(Z, W, C, \mathbf{Y}), \]

and so each $\Gamma(p_k)$ is still homogeneous of degree $k$ in $\mathbf{Y}$.

Recall that, for $\operatorname{Im}(z) \to +\infty$, letting $q = \exp(2\pi i z)$, we have

\[ j(z) = q^{-1} + O(1), \qquad j'(z) = -2\pi i \, q^{-1} + O(1), \qquad j''(z) = -4\pi^2 q^{-1} + O(1). \]

Now fix some arbitrary $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$. We have

\[ p^{\gamma}(z, \mathbf{j}(z)) = q^{-e} (cz + d)^n \, r(\gamma z, cz + d, c) + O(q^{-e+1} z^{n+N}) \]

for $\operatorname{Im}(z) \to +\infty$ in the standard fundamental domain, where $N$ is the degree of $\Gamma(p)$ in the variable $W$, and $r(Z, W, C) := \Gamma(p_e)(Z, W, C, 1, -2\pi i, -4\pi^2)$. More precisely, when $c \neq 0$, letting $M = \deg_W(r)$, we can also say

\[ p^{\gamma}(z, \mathbf{j}(z)) = q^{-e} (cz)^n \, r\!\left(\tfrac{a}{c}, cz, c\right) + O(q^{-e} z^{n+M-1}). \]

Therefore, the order of $p^{\gamma}(z, \mathbf{j}(z))$ at $i\infty$ is $(-e, -(n+M))$, and so the order of $p(\gamma z, \mathbf{j}(\gamma z))$ is $(-e, -M)$, unless the leading coefficient of $r(\tfrac{a}{c}, W, c)$ vanishes or $c = 0$.

To conclude, since the map $\gamma \mapsto (a, c)$ from $\mathrm{SL}_2(\mathbb{C})$ to $\mathbb{C}^2$ is dominant, it suffices to prove that $r$ is non-trivial (as we did in the proof of Proposition 6.2). By Lemma 6.1, at $\alpha^2 = -2\pi i$, $\beta = -4\pi^2$, $U_1^2 = Y_1 Y_0^{-1}$, $U_2 = Y_2 Y_0^{-1}$, there are $V_1, V_2 \in \mathbb{C}(U_1, U_2)$ such that

\[ r(X, V_1, V_2) = \Gamma(p_e)(X, V_1, V_2, 1, -2\pi i, -4\pi^2) = p_e\!\left(X, \frac{Y_0}{Y_0}, \frac{Y_1}{Y_0}, \frac{Y_2}{Y_0}\right) = Y_0^{-e} \, p_e(X, \mathbf{Y}), \]

where the last equality follows from the homogeneity of $p_e$. Since by assumption $p_e \neq 0$, this shows that $r$ is non-trivial, as desired. ∎

Example 6.5

It is easy to construct examples where a function has no exponential growth at some cusp. For instance, if $p(\mathbf{Y}) = 4\pi^2 Y_0 + Y_2$, then $p(\mathbf{j}(z)) = 4\pi^2 j(z) + j''(z)$ is bounded for $z \to i\infty$ in the standard fundamental domain. However, after the transformation $z \mapsto -\frac{1}{z}$, one gets $4\pi^2 j(z) + z^4 j''(z) + 2 z^3 j'(z)$, which has order $(-1, -4)$ at $i\infty$, since $z^4 j''(z)$ is the dominant term. Note that here $4 = \deg_W(\Gamma(p))$.
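The boundedness claim for $4\pi^2 j + j''$ can be checked on the first few terms of the $q$-expansion. The following is a minimal sketch, not part of the original argument: it assumes the classical coefficients $j = q^{-1} + 744 + 196884\,q + 21493760\,q^2 + \dots$ and uses $\frac{d}{dz} = 2\pi i \, q \frac{d}{dq}$, so that $4\pi^2 j + j'' = 4\pi^2 (j - \theta^2 j)$ with $\theta = q \frac{d}{dq}$. The helper names `theta` and `combine` are illustrative only.

```python
from fractions import Fraction

# Truncated q-expansion of j as {exponent: coefficient}:
# j = q^-1 + 744 + 196884 q + 21493760 q^2 + O(q^3)
J_SERIES = {-1: Fraction(1), 0: Fraction(744),
            1: Fraction(196884), 2: Fraction(21493760)}

def theta(series):
    """Apply q * d/dq termwise (recall d/dz = 2*pi*i * q * d/dq)."""
    return {n: n * c for n, c in series.items() if n * c != 0}

def combine(a, b, scale_b):
    """Return the series a + scale_b * b, dropping zero coefficients."""
    out = dict(a)
    for n, c in b.items():
        out[n] = out.get(n, 0) + scale_b * c
    return {n: c for n, c in out.items() if c != 0}

# j'' = (2*pi*i)^2 * theta^2(j) = -4*pi^2 * theta^2(j), hence
# 4*pi^2*j + j'' = 4*pi^2 * (j - theta^2(j)).
bracket = combine(J_SERIES, theta(theta(J_SERIES)), -1)
lowest_exponent = min(bracket)

print(bracket)          # the q^-1 term has cancelled
print(lowest_exponent)  # no negative power of q survives
```

Since no negative power of $q$ survives in the bracket, the combination stays bounded as $\operatorname{Im}(z) \to +\infty$, in line with the example.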

There are also simple examples where the dominant terms cancel out at all cusps. Take $h(\mathbf{Y}) = Y_1^2 - Y_0 Y_2$. Then

\[ h(\mathbf{j}(\gamma z)) = (cz+d)^4 j'(z)^2 - (cz+d)^4 j(z) j''(z) - 2c(cz+d)^3 j(z) j'(z) \]

has order $(-2, -3)$ for $c \neq 0$, because $j(z) j''(z) \sim (j'(z))^2 \sim -4\pi^2 q^{-2}$, and it has order $(-2, 0)$ for $c = 0$. On the other hand, $\deg_W(\Gamma(h)) = 4$. Proposition 6.4 guarantees that, even though these cancellations may occur at all cusps, some term of maximal exponential growth is not cancelled, at least generically.

Finally, we compute the order of $p(z, \mathbf{j}(z))$ at the points $\tau$ for which $j'(\tau) = 0$, namely the orbits of $\rho$ and $i$. Here, the order is at least the maximum $\nu$ such that $T^{\nu}$ divides respectively $p(X, T^3 Y_0, T^2 Y_1, T Y_2)$ (for the conjugates of $\rho$) and $p(X, T^2 Y_0 + 1728, T Y_1, Y_2)$ (for the conjugates of $i$). This estimate is not sharp when $p$ depends on $Y_2$, as we show in an example below, so we restrict to $p \in \mathbb{C}[X, Y_0, Y_1]$.

Proposition 6.6

Let $p \in \mathbb{C}[X, Y_0, Y_1]$, $\tau \in \mathbb{H}$, $u = j(\tau)$, and let $\mu$ be the order of $j(z) - u$ at $z = \tau$. Then, for all $\gamma$ in a Zariski open dense subset of $\mathrm{SL}_2(\mathbb{Z})$, the order of $p(z, \mathbf{j}(z))$ at $z = \gamma\tau$ is the highest power of $T$ dividing $p(X, T^{\mu} Y_0 + u, T^{\mu-1} Y_1)$.

Proof

The proof is very similar to that of Proposition 6.4. Write

\[ p(X, T^{\mu} Y_0 + u, T^{\mu-1} Y_1) = \sum_{k=\nu}^{m} T^k p_k(X, \mathbf{Y}), \]

where each $p_k$ satisfies the homogeneity condition

\[ p_k(X, T^{\mu} Y_0, T^{\mu-1} Y_1) = T^k p_k(X, Y_0, Y_1) \]

and $p_{\nu} \neq 0$. Since

\[ \Gamma(p)(X, T^{\mu} Y_0 + u, T^{\mu-1} Y_1) = \Gamma\big(p(X, T^{\mu} Y_0 + u, T^{\mu-1} Y_1)\big), \]

we have

\[ \Gamma(p)(X, T^{\mu} Y_0 + u, T^{\mu-1} Y_1) = \sum_{k=\nu}^{m} T^k \, \Gamma(p_k)(X, \mathbf{Y}) \]

and each $\Gamma(p_k)$ satisfies the above homogeneity condition.

Fix $\alpha_0 \in \mathbb{C}$ such that $j(z) - u \sim \alpha_0 (z - \tau)^{\mu}$, and so also $j'(z) \sim \mu\alpha_0 (z - \tau)^{\mu-1}$, for $z \to \tau$, where by assumption $\mu\alpha_0 \neq 0$. For $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$, we have

\[ p^{\gamma}(z, \mathbf{j}(z)) = (c\tau + d)^n \, r(\gamma\tau, c\tau + d, c) \, (z - \tau)^{\nu} + O\big((z - \tau)^{\nu+1}\big) \]

as $z \to \tau$, where

\[ r(Z, W, C) := \Gamma(p_{\nu})(Z, W, C, \alpha_0, \mu\alpha_0). \]

It follows that the order of $p^{\gamma}(z, \mathbf{j}(z))$ at $z = \tau$ is $\nu$ as long as $r$ does not vanish. Since $z \mapsto \gamma z$ is a diffeomorphism, this coincides with the order of $p(z, \mathbf{j}(z))$ at $z = \gamma\tau$.

To conclude, since the map $\gamma \mapsto (\gamma\tau, c\tau + d, c)$ is injective on $\mathrm{SL}_2(\mathbb{C})$, hence dominant on $\mathbb{C}^3$, we only need to show that $r$ is non-trivial. Pick $\beta_0$ such that $\beta_0^{2\mu} = \alpha_0^{-1}$ and $U_0$ such that $U_0^{2\mu} = Y_0$. By Lemma 6.1, at $\alpha^2 = \mu\alpha_0$, $\beta = 0$, $U_1^2 = (\beta_0 U_0)^{-2(\mu-1)} Y_1$, there are $V_1, V_2 \in \mathbb{C}(U_1, U_2)$ such that

\[ r(X, V_1, V_2) = \Gamma(p_{\nu})(X, V_1, V_2, \alpha_0, \mu\alpha_0) = p_{\nu}\!\left(X, \alpha_0, \frac{Y_1}{(\beta_0 U_0)^{2(\mu-1)}}\right) = p_{\nu}\!\left(X, \frac{U_0^{2\mu}}{(\beta_0 U_0)^{2\mu}}, \frac{Y_1}{(\beta_0 U_0)^{2(\mu-1)}}\right) = (\beta_0 U_0)^{-2\nu} \, p_{\nu}(X, Y_0, Y_1), \]

where the last equality is implied by the homogeneity condition (applied with $T = (\beta_0 U_0)^{-2}$). It follows that $r$ is non-trivial, as desired. ∎

Example 6.7

The above method fails when $p$ depends on $Y_2$. Indeed, to compute say the order of $p(\mathbf{j}(\gamma z))$ at $\rho$, we would look at the maximum power of $T$ dividing

\[ p^{\gamma}(X, T^3 Y_0, T^2 Y_1, T Y_2). \]

However, if $p$ contains $Y_2$, then

\[ \Gamma(p)(X, T^3 Y_0, T^2 Y_1, T Y_2) \neq \Gamma\big(p(X, T^3 Y_0, T^2 Y_1, T Y_2)\big), \]

breaking the very first steps of the argument.

The order can indeed be higher than expected at all the conjugates of $\rho$ or $i$. For instance, for the polynomial

\[ p(\mathbf{Y}) = Y_0 Y_2 - \tfrac{2}{3} Y_1^2, \]

the maximum power of $T$ dividing $p(T^3 Y_0, T^2 Y_1, T Y_2)$ is $4$; however, the function $p(\mathbf{j}(z))$ has order at least $5$ at all conjugates of $\rho$: if $j(z) \sim \alpha (z - \gamma\rho)^3$ for $z \to \gamma\rho$, then

\[ p(\mathbf{j}(z)) = \alpha^2 (z - \gamma\rho)^4 \left(6 - \tfrac{2}{3} \cdot 3^2 + O(z - \gamma\rho)\right) = O\big((z - \gamma\rho)^5\big). \]
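The cancellation in this example can be replayed on a toy local expansion. The sketch below is illustrative only: it takes the local model $j = w^3 + 5w^4$ in the variable $w = z - \gamma\rho$ (the normalisation $\alpha = 1$ and the coefficient $5$ are arbitrary assumptions, not values from the paper), evaluates $p = Y_0 Y_2 - \tfrac{2}{3} Y_1^2$ on $(j, j', j'')$, and compares the actual vanishing order with the weighted estimate coming from $(T^3 Y_0, T^2 Y_1, T Y_2)$.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two polynomials in w given as coefficient lists (index = power)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for k, y in enumerate(b):
            out[i + k] += x * y
    return out

def poly_diff(a):
    """Differentiate a coefficient list with respect to w."""
    return [i * c for i, c in enumerate(a)][1:]

def lowest_power(a):
    """Index of the first non-zero coefficient, i.e. the order of vanishing at w = 0."""
    return next(i for i, c in enumerate(a) if c != 0)

# Weighted estimate: monomials of p as {(e0, e1, e2): coeff}, with weights (3, 2, 1).
T_WEIGHTS = (3, 2, 1)
P_MONOMIALS = {(1, 0, 1): Fraction(1), (0, 2, 0): Fraction(-2, 3)}
expected_order = min(sum(w * e for w, e in zip(T_WEIGHTS, exps))
                     for exps in P_MONOMIALS)

# Hypothetical local model near a conjugate of rho: j = w^3 + 5 w^4.
j0 = [Fraction(0), Fraction(0), Fraction(0), Fraction(1), Fraction(5)]
j1 = poly_diff(j0)
j2 = poly_diff(j1)

value = [x - Fraction(2, 3) * y
         for x, y in zip(poly_mul(j0, j2), poly_mul(j1, j1))]
actual_order = lowest_power(value)

print(expected_order, actual_order)
```

In this model the weighted estimate gives 4 while the function vanishes to order 5, because the $w^4$ coefficient is $6 - \tfrac{2}{3} \cdot 9 = 0$.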

Corollary 6.8

For all $p \in \mathbb{C}[Y_0, Y_1]$, $h \in \mathbb{C}[Y_0]$, and $\ell \geq \deg_{Y_1}(p)$, if the function

\[ P(z) := \frac{p(j(z), j'(z))}{h(j(z)) \, j'(z)^{\ell}} \]

is non-constant, then it has a pole in $\mathbb{H}$ or it has exponential growth in some fundamental domains.

Proof

Suppose that P ( z ) has no poles in ℍ and has no exponential growth in any fundamental domain. Without loss of generality, we may assume that 𝑝 and ℎ are coprime.

Let $\beta$ be a root of $h(Y_0)$ such that $\beta \notin \{0, 1728\}$, and let $\tau$ be such that $j(\tau) = \beta$. In particular, $Y_0 - \beta$ does not divide $p$. By Proposition 6.2, $p(\gamma\tau, \mathbf{j}(\gamma\tau)) \neq 0$ for all $\gamma \in \mathrm{SL}_2(\mathbb{Z})$ except for some proper Zariski closed set, and so $P$ has a pole at $\gamma\tau$, a contradiction. Therefore, $h(Y_0)$ is of the form $\alpha Y_0^s (Y_0 - 1728)^t$ for some $\alpha \in \mathbb{C}$.

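Now $h(j(z)) \, j'(z)^{\ell}$ has order $3s + 2\ell$ at all the conjugates of $\rho$ and $2t + \ell$ at all the conjugates of $i$; thus $p(j(z), j'(z))$ must have at least the same order at those points. On writing $p = \sum_{\ell'} Y_1^{\ell'} p_{\ell'}(Y_0)$, Proposition 6.6 implies that each $p_{\ell'}(Y_0)$ is divisible by

\[ Y_0^{\,s + \lceil 2(\ell - \ell')/3 \rceil} \quad \text{and} \quad (Y_0 - 1728)^{\,t + \lceil (\ell - \ell')/2 \rceil}. \]

In particular, whenever $p_{\ell'} \neq 0$, we have

\[ \deg(p) \geq \ell' + \deg(p_{\ell'}) \geq s + t + \left\lceil \tfrac{2(\ell - \ell')}{3} \right\rceil + \left\lceil \tfrac{\ell - \ell'}{2} \right\rceil + \ell' \geq s + t + \ell' + \tfrac{7}{6}(\ell - \ell'), \]

with strict inequality if $p_{\ell'}$ is not a constant multiple of

\[ Y_0^{\,s + \lceil 2(\ell - \ell')/3 \rceil} (Y_0 - 1728)^{\,t + \lceil (\ell - \ell')/2 \rceil}. \]

On the other hand, by Proposition 6.4, on a Zariski open dense set of fundamental domains, the denominator has order at most $(-(s + t + \ell), 0)$ at the cusp, while the numerator has order at most $(-\deg(p), 0)$; thus, whenever $p_{\ell'} \neq 0$, we also have

\[ s + t + \ell' + \tfrac{7}{6}(\ell - \ell') \leq \deg(p) \leq s + t + \ell. \]

As $\ell \geq \deg_{Y_1}(p) \geq \ell'$ and $\tfrac{7}{6} > 1$, it follows at once that $\ell' = \ell$ and that $p = Y_1^{\ell} p_{\ell}(Y_0)$ is a constant multiple of $Y_0^s (Y_0 - 1728)^t Y_1^{\ell}$, and so that $P(z)$ is constant. ∎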

7 The main result

7.1 𝐣-homogeneous equations

Before tackling the general case of Theorem 1.2, we look at equations of the form $F(\mathbf{j}(z)) = 0$, where $F \in \mathbb{C}[\mathbf{Y}]$ satisfies the homogeneity condition below.

Definition 7.1

The 𝐣-degree of $F \in \mathbb{C}[X, \mathbf{Y}]$ is the degree of $F(X, Y_0, T^2 Y_1, T^4 Y_2)$ in $T$, which we denote $\deg_{\mathbf{j}}(F)$. We say that $F$ is 𝐣-homogeneous if $F(X, Y_0, T^2 Y_1, T^4 Y_2)$ is homogeneous in the variable $T$.

One of the easiest examples of a 𝐣-homogeneous polynomial is $F = Y_2$, which has 𝐣-degree 4. For the sake of exposition, we first sketch the proof of Zariski density for this $F$, namely for the equation $j''(z) = 0$.

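First, we observe that, for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$ with $c \neq 0$,

\[ j''(\gamma z) = j''(z) \, c^4 \left( \left(z + \tfrac{d}{c}\right)^4 + 2 \left(z + \tfrac{d}{c}\right)^3 \frac{j'(z)}{j''(z)} \right) = j''(z) \, c^4 \, h\!\left(z + \tfrac{d}{c}, z\right), \tag{7.1} \]

where

\[ h(X, W) = X^4 + 2 \, \frac{j'(W)}{j''(W)} \, X^3. \]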
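For the reader's convenience, here is a short derivation of the transformation rule behind (7.1). It is a standard chain-rule computation, written out as a sketch rather than quoted from the paper; it uses only $j(\gamma z) = j(z)$ and $ad - bc = 1$.

```latex
\begin{align*}
\frac{d}{dz}(\gamma z) &= \frac{a(cz+d) - c(az+b)}{(cz+d)^2} = \frac{1}{(cz+d)^2}, \\
j'(\gamma z)\,(cz+d)^{-2} &= j'(z)
  \;\Longrightarrow\; j'(\gamma z) = (cz+d)^2\, j'(z), \\
j''(\gamma z)\,(cz+d)^{-2} &= 2c\,(cz+d)\, j'(z) + (cz+d)^2\, j''(z) \\
  &\Longrightarrow\; j''(\gamma z) = (cz+d)^4\, j''(z) + 2c\,(cz+d)^3\, j'(z), \\
j''(\gamma z) &= c^4\, j''(z) \left( \Big(z + \tfrac{d}{c}\Big)^4
  + 2 \Big(z + \tfrac{d}{c}\Big)^3 \frac{j'(z)}{j''(z)} \right)
  \qquad (c \neq 0,\ j''(z) \neq 0).
\end{align*}
```

The last line is obtained by factoring $c^4 j''(z)$ out of the previous one, using $(cz + d) = c\,(z + \tfrac{d}{c})$.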

If we can find $\tau \in \mathbb{H}$ such that $j''(\tau) = 0 \neq j'(\tau)$, we are done by Proposition 5.4, so suppose by contradiction that this does not happen. In this case, $j'(z)/j''(z)$ is bounded on the standard fundamental domain $\mathbb{F}$: by construction, it cannot have poles in $\mathbb{H}$, and by looking at the $q$-expansions, it is also bounded for $\operatorname{Im}(z) \to +\infty$. This also implies that $h(\tau + \tfrac{d}{c}, \tau) \neq 0$ for any $\tfrac{d}{c} \in \mathbb{Q}$, $\tau \in \mathbb{H}$ except possibly when $j'(\tau) = 0$.

Second, under the above assumptions, we shall verify that $h(\tau + r, \tau) \neq 0$ for all $r \in \mathbb{R}$, $\tau \in \mathbb{H}$ (Claim 7.3.1), and in turn deduce that

\[ \left| h\!\left(z + \tfrac{d}{c}, z\right) \right| \geq \varepsilon \left| z + \tfrac{d}{c} \right|^4 \]

for all $z \in \mathbb{F}$, for some $\varepsilon > 0$ (Claim 7.3.2).

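In particular, for all $z \in \mathbb{F}$, we have

\[ \left| \frac{j'(\gamma z)}{j''(\gamma z)} \right| \leq \left| \frac{j'(z)}{\varepsilon \, j''(z)} \right| |cz + d|^{-2} = \left| \frac{j'(z)}{\varepsilon \, j''(z)} \right| \frac{\operatorname{Im}(\gamma z)}{\operatorname{Im}(z)} \leq \frac{2M \operatorname{Im}(\gamma z)}{\sqrt{3} \, \varepsilon}, \]

where $M$ is a bound for $|j'(z)/j''(z)|$ on $\mathbb{F}$, and so $j'(z)/j''(z) \to 0$ as $\operatorname{Im}(z) \to 0$. By the Schwarz Reflection Principle, $j'(z)/j''(z)$ extends to a holomorphic function on $\mathbb{C}$ that vanishes for all $z \in \mathbb{R}$, and is thus constantly $0$, a contradiction.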
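The estimate above rests on the standard identity relating $|cz + d|$ to imaginary parts. A brief derivation, included as a reminder and not taken from the paper, reads:

```latex
\begin{align*}
\gamma z &= \frac{az+b}{cz+d} = \frac{(az+b)(c\bar z + d)}{|cz+d|^2}, \\
\operatorname{Im}(\gamma z) &= \frac{(ad - bc)\operatorname{Im}(z)}{|cz+d|^2}
  = \frac{\operatorname{Im}(z)}{|cz+d|^2}, \\
|cz+d|^{-2} &= \frac{\operatorname{Im}(\gamma z)}{\operatorname{Im}(z)}
  \leq \frac{2}{\sqrt{3}}\,\operatorname{Im}(\gamma z)
  \qquad \text{for } z \in \mathbb{F},
\end{align*}
```

where the last inequality uses that $\operatorname{Im}(z) \geq \tfrac{\sqrt{3}}{2}$ on the standard fundamental domain.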

For a general 𝐣-homogeneous $F$, we just need to find an appropriate generalisation of equation (7.1) and fill in the details of the above sketch.

Lemma 7.2

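Let $F \in \mathbb{C}[\mathbf{Y}]$ be 𝐣-homogeneous. Then there are polynomials $p_k \in \mathbb{C}[\mathbf{Y}]$ and $h \in \mathbb{C}[X, \mathbf{Y}]$ such that, for all $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$ with $c \neq 0$, we have

\[ F(\mathbf{j}(\gamma z)) = F(\mathbf{j}(z)) \, c^N \left( \sum_{k=k_0}^{N} \frac{p_k(\mathbf{j}(z))}{F(\mathbf{j}(z))} \left(z + \tfrac{d}{c}\right)^k \right) = F(\mathbf{j}(z)) \, c^N \, h\!\left(z + \tfrac{d}{c}, \mathbf{j}(z)\right), \]

where $N = \deg_{\mathbf{j}}(F)$, $p_N = F$, and $0 \neq p_{k_0} \in Y_1^{\ell} \, \mathbb{C}[Y_0]$ with $2\ell \leq k_0$.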

Proof

Let $F$ be as in the hypothesis. By 𝐣-homogeneity, we have that

\[ \Gamma(F)(Z, W, C, \mathbf{Y}) = F(Z, Y_0, W^2 Y_1, W^4 Y_2 + 2 C W^3 Y_1) = C^N F\!\left(Z, Y_0, \frac{W^2}{C^2} Y_1, \frac{W^4}{C^4} Y_2 + 2 \frac{W^3}{C^3} Y_1\right). \]

Therefore, we can write $\Gamma(F)$ as

\[ \Gamma(F) = C^N \sum_{k=0}^{N} p_k(\mathbf{Y}) \frac{W^k}{C^k}, \]

where $N = \deg_{\mathbf{j}}(F)$. Let $k_0$ be the least integer such that $p_{k_0} \neq 0$.

By a further application of 𝐣-homogeneity, we also have

\[ \Gamma(F) = W^N Y_1^{N/2} \, F\!\left(Z, Y_0, 1, \frac{Y_2}{Y_1^2} + \frac{2C}{W Y_1}\right). \]

It follows at once that the terms of maximum degree in $W$, which make up $(W^N / C^N) \, p_N$, are found by discarding $2C/(W Y_1)$, and in particular, we discover that $p_N(\mathbf{Y}) = F(\mathbf{Y})$. Similarly, the terms of lowest degree in $W/C$ are obtained by specialising at $Y_2 = 0$ and taking the least power of $Y_1$, and so $p_{k_0} \in Y_1^{\ell} \, \mathbb{C}[Y_0]$. Moreover, if $t$ is the degree of $F$ in $Y_2$, we have that $k_0 = N - t$, $\ell = \tfrac{N}{2} - t$; thus $2\ell \leq k_0$. ∎

Theorem 7.3

For any irreducible 𝐣-homogeneous polynomial $F(\mathbf{Y}) \notin \mathbb{C}[X, Y_0, Y_1]$, the equation $F(\mathbf{j}(z)) = 0$ has a Zariski dense set of solutions.

Proof

Let $F$ be as in the hypothesis and fix the polynomials $p_k$, $h$ as in the conclusion of Lemma 7.2.

If some $p_k(\mathbf{j}(z)) / F(\mathbf{j}(z))$ has a pole in $\mathbb{H}$ or exponential growth in some fundamental domain, we are done by Proposition 5.4. Therefore, we shall assume that this is not the case. In particular, we assume that $h(z + \tfrac{d}{c}, \mathbf{j}(z))$ has no pole in $\mathbb{H}$ for any $\tfrac{d}{c}$.

With this additional assumption, if $F(\mathbf{j}(\tau)) = 0$, then $F(\mathbf{j}(\gamma\tau)) = 0$ for all $\gamma \in \mathrm{SL}_2(\mathbb{Z})$. Since $F$ is not divisible by $Y_0 - j(\tau)$, Proposition 6.2 implies that $j'(\tau) = 0$. In particular, for any $\tfrac{d}{c}$, $h(\tau + \tfrac{d}{c}, \mathbf{j}(\tau)) = 0$ implies that $j'(\tau) = 0$.

Claim 7.3.1

For every $u \in \mathbb{R}$, the function $h(z + u, \mathbf{j}(z))$ has no zero in $\mathbb{H}$.

Proof of the claim

Suppose $h(\tau + u, \mathbf{j}(\tau)) = 0$ for some $(\tau, u) \in \mathbb{H} \times \mathbb{R}$. As $h(Z, \mathbf{Y})$ is monic in $Z$, the analytic map $(z, u) \mapsto (h(z + u, \mathbf{j}(z)), u)$ has finite fibres, in particular of dimension zero. Then the Open Mapping Theorem implies that the image of any ball around $(\tau, u)$ contains an open neighbourhood of $(0, u)$. In particular, for every rational number $r$ sufficiently close to $u$, there is $\tau_r$ close to $\tau$ such that $h(\tau_r + r, \mathbf{j}(\tau_r)) = 0$.

Since $h$ is monic in $Z$, the polynomial $h(\tau + Z, \mathbf{j}(\tau))$ is not identically zero; thus, for $r \neq u$ sufficiently close to $u$, we have $h(\tau + r, \mathbf{j}(\tau)) \neq 0$, and so $\tau_r \neq \tau$. Therefore, the $\tau_r$'s can be chosen to accumulate at $\tau$ for $r \to u$. However, our assumptions imply that $j'(\tau_r) = 0$, so the $\tau_r$'s lie in a closed discrete subset of $\mathbb{H}$ (namely, the orbits of $\rho$ and $i$), a contradiction. ∎

Claim 7.3.2

There is $\varepsilon > 0$ such that

\[ |h(z + u, \mathbf{j}(z))| \geq \varepsilon \, |z + u|^N \]

for all $z \in \mathbb{F}$ and $u \in \mathbb{R}$.

Proof of the claim

Our current assumptions imply that each $p_k(\mathbf{j}(z)) / F(\mathbf{j}(z))$ has neither a pole in $\overline{\mathbb{F}}$ nor exponential growth in $\mathbb{F}$. Since these functions are in $\mathcal{P}$ rather than $\mathcal{P}(w)$, their order at $i\infty$ is of the form $(e, 0)$, thus at least $(0, 0)$, and so they are bounded in $\overline{\mathbb{F}}$, say by $M > 0$. In particular, since $h$ is monic in $Z$, we have $|h(z + u, \mathbf{j}(z))| > \tfrac{1}{2} |z + u|^N$ as soon as $|z + u| > 2NM$. On the other hand, $|z + u| \leq 2NM$ defines a compact subset $K$ of $\overline{\mathbb{F}} \times \mathbb{R}$, and so the function

\[ \left| \frac{h(z + u, \mathbf{j}(z))}{(z + u)^N} \right| \]

attains some minimum $\varepsilon' > 0$ on $K$, since it does not vanish by Claim 7.3.1. The conclusion follows on taking $\varepsilon = \min\{\varepsilon', \tfrac{1}{2}\}$. ∎

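Therefore, for $z \in \mathbb{F}$, we find that

\[ |F(\mathbf{j}(\gamma z))| \geq \varepsilon \, |F(\mathbf{j}(z))| \, |c|^N \left| z + \tfrac{d}{c} \right|^N = \varepsilon \, |F(\mathbf{j}(z))| \, |cz + d|^N, \]

and in particular, for some $M > 0$ independent of $z$, we have

\[ \left| \frac{p_{k_0}(\mathbf{j}(\gamma z))}{F(\mathbf{j}(\gamma z))} \right| = \frac{|p_{k_0}(\mathbf{j}(z))| \, |cz + d|^{2\ell}}{|F(\mathbf{j}(\gamma z))|} \leq \frac{|p_{k_0}(\mathbf{j}(z))|}{\varepsilon \, |F(\mathbf{j}(z))|} \, |cz + d|^{2\ell - N} \leq M \operatorname{Im}(\gamma z)^{\frac{N}{2} - \ell}, \]

where we have used that $2\ell < N$ and that, for $z \in \mathbb{F}$, $p_{k_0}(\mathbf{j}(z)) / F(\mathbf{j}(z))$ is bounded while also

\[ |cz + d|^2 = \frac{\operatorname{Im}(z)}{\operatorname{Im}(\gamma z)} \geq \frac{\sqrt{3}}{2 \operatorname{Im}(\gamma z)} > 0. \]

Therefore,

\[ \frac{p_{k_0}(\mathbf{j}(z))}{F(\mathbf{j}(z))} \to 0 \quad \text{as } \operatorname{Im}(z) \to 0; \]

thus, by Schwarz's reflection principle, $p_{k_0}(\mathbf{j}(z)) / F(\mathbf{j}(z))$ has a holomorphic extension to $\mathbb{C}$ that is constantly zero on $\mathbb{R}$, thus constantly zero on $\mathbb{C}$; hence $p_{k_0} = 0$, a contradiction. ∎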

Remark 7.4

Let $S = \{ z \in \mathbb{H} : F(\mathbf{j}(z)) = 0 \}$ for some $F$ as in Theorem 7.3. As observed in the proof, every $\tau \in S$ that is not in the $\mathrm{SL}_2(\mathbb{Z})$-orbit of $\rho$ or $i$ must also be a pole of some coefficient $p_k(\mathbf{j}(z)) / F(\mathbf{j}(z))$ appearing in Lemma 7.2. Combining this information with Proposition 4.1, it follows that $j(S)$ is infinite and every point of $j(S) \setminus \{0, 1728\}$ is an accumulation point of $j(S)$.

7.2 Proof of Theorem 1.2

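For the rest of the section, fix some $F \in \mathbb{C}[X, \mathbf{Y}]$ and let $p_k \in \mathbb{C}[Z, C, \mathbf{Y}]$ be polynomials such that

\[ \Gamma(F)(Z, W, C, \mathbf{Y}) = \sum_{k=0}^{N} p_k(Z, C, \mathbf{Y}) \, W^k, \]

where $N = \deg_W(\Gamma(F))$. Given a polynomial $\alpha \in \mathbb{C}[X, \mathbf{Y}]$, we will use the notation[5]

\[ \alpha^{\mathbb{N}} := \{ \alpha^s : s \in \mathbb{N} \}. \]

Given different polynomials $\alpha_1, \dots, \alpha_{\ell} \in \mathbb{C}[X, \mathbf{Y}]$, we use $\alpha_1^{\mathbb{N}} \cdots \alpha_{\ell}^{\mathbb{N}}$ to denote the set of all products between elements of different $\alpha_t^{\mathbb{N}}$.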

Proposition 7.5

The polynomial $p_N$ is the sum of the terms of maximum 𝐣-degree in $F(Z, Y_0, W^2 Y_1, W^4 Y_2)$. In particular, $N = \deg_{\mathbf{j}}(F)$, $p_N$ is 𝐣-homogeneous, and $p_N$ does not depend on $C$.

Proof

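Let $X^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1} Y_2^{\beta_2}$ denote a monomial, so $\alpha, \beta_0, \beta_1, \beta_2 \in \mathbb{N}$. Observe that

\[ \Gamma(X^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1} Y_2^{\beta_2}) = Z^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1} W^{2\beta_1 + 3\beta_2} (W Y_2 + 2 C Y_1)^{\beta_2}. \]

Hence

\[ \deg_W\big(\Gamma(X^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1} Y_2^{\beta_2})\big) = 2\beta_1 + 4\beta_2 \]

and the term accompanying $W^{2\beta_1 + 4\beta_2}$ is $Z^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1} Y_2^{\beta_2}$.

Now write $\beta = (\beta_0, \beta_1, \beta_2)$ and

\[ F(X, \mathbf{Y}) = \sum_{(\alpha, \beta) \in \mathbb{N}^4} c_{\alpha, \beta} \, X^{\alpha} \mathbf{Y}^{\beta}, \]

where $\mathbf{Y}^{\beta} = Y_0^{\beta_0} Y_1^{\beta_1} Y_2^{\beta_2}$ and $c_{\alpha, \beta} \in \mathbb{C}$. Then, since $\Gamma$ is a homomorphism by 1,

\[ \Gamma(F) = \sum_{(\alpha, \beta) \in \mathbb{N}^4} c_{\alpha, \beta} \, \Gamma(X^{\alpha} \mathbf{Y}^{\beta}). \]

Since $N = \deg_W(\Gamma(F))$, then

\[ p_N = \sum_{\substack{(\alpha, \beta) \in \mathbb{N}^4 \\ 2\beta_1 + 4\beta_2 = N}} c_{\alpha, \beta} \, Z^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1} Y_2^{\beta_2}. \]

From this, we see that $p_N$ does not depend on $C$ and that $p_N$ is 𝐣-homogeneous. This expression also gives us that $p_N$ is the coefficient of $W^N$ in $F(Z, Y_0, W^2 Y_1, W^4 Y_2)$, and so

\[ N = \deg_W\big(F(Z, Y_0, W^2 Y_1, W^4 Y_2)\big) = \deg_{\mathbf{j}}(F). \]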

Definition 7.6

The 𝐣-order of $F$, denoted by $\operatorname{ord}_{\mathbf{j}}(F)$, is the maximum power of $T$ dividing $F(X, Y_0, T^2 Y_1, T^3 Y_2)$.

Proposition 7.7

Let $k_0$ be minimum such that $p_{k_0} \neq 0$. Then $p_{k_0}$ is the sum of the terms of minimum degree in $W$ of $F(Z, Y_0, W^2 Y_1, 2 C W^3 Y_1)$. In particular,

\[ k_0 = \operatorname{ord}_{\mathbf{j}}(F) \geq 2 \deg_{Y_1}(p_{k_0}) \]

and $p_{k_0}$ does not depend on $Y_2$.

Proof

We proceed as in the proof of Proposition 7.5. From

\[ \Gamma(X^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1} Y_2^{\beta_2}) = Z^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1} W^{2\beta_1 + 3\beta_2} (W Y_2 + 2 C Y_1)^{\beta_2}, \]

we see that the smallest power of $W$ appearing in this expression is $2\beta_1 + 3\beta_2$, and it is accompanied by $2^{\beta_2} C^{\beta_2} Z^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1 + \beta_2}$, which does not depend on $Y_2$. Hence

\[ p_{k_0} = \sum_{\substack{(\alpha, \beta) \in \mathbb{N}^4 \\ 2\beta_1 + 3\beta_2 = k_0}} c_{\alpha, \beta} \, 2^{\beta_2} C^{\beta_2} Z^{\alpha} Y_0^{\beta_0} Y_1^{\beta_1 + \beta_2}, \]

which also shows that $p_{k_0}$ is the coefficient of $W^{k_0}$ in $F(Z, Y_0, W^2 Y_1, 2 C W^3 Y_1)$. Using the change of variables $C = Y_2 / (2 Y_1)$, which turns $2 C W^3 Y_1$ into $W^3 Y_2$, we conclude that $k_0 = \operatorname{ord}_{\mathbf{j}}(F)$, and since

\[ k_0 = 2\beta_1 + 3\beta_2 \geq 2(\beta_1 + \beta_2), \]

we have $k_0 \geq 2 \deg_{Y_1}(p_{k_0})$, concluding the proof. ∎
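The monomial formula for $\Gamma$ used in the last two proofs is easy to mechanise. The sketch below is an illustration under stated assumptions rather than the authors' code: polynomials are stored as dictionaries from exponent tuples to coefficients, and the names `gamma_op` and `coefficient_of_W` are hypothetical. It applies the substitution $X \mapsto Z$, $Y_1 \mapsto W^2 Y_1$, $Y_2 \mapsto W^4 Y_2 + 2CW^3 Y_1$ and reads off the top and bottom coefficients in $W$ for $F = Y_0 Y_2 - Y_1^2$.

```python
from collections import defaultdict
from math import comb

def gamma_op(poly):
    """Apply Gamma to poly = {(a, b0, b1, b2): coeff}, i.e. X^a Y0^b0 Y1^b1 Y2^b2.

    Returns {(a, w, c, b0, b1, b2): coeff} for Z^a W^w C^c Y0^b0 Y1^b1 Y2^b2.
    """
    out = defaultdict(int)
    for (a, b0, b1, b2), coeff in poly.items():
        # Expand (W^4 Y2 + 2 C W^3 Y1)^b2 with the binomial theorem;
        # i counts the factors 2 C W^3 Y1 that are chosen.
        for i in range(b2 + 1):
            w = 2 * b1 + 4 * (b2 - i) + 3 * i
            out[(a, w, i, b0, b1 + i, b2 - i)] += coeff * comb(b2, i) * 2 ** i
    return {m: c for m, c in out.items() if c != 0}

def coefficient_of_W(gamma_poly, k):
    """Coefficient of W^k, as {(a, c, b0, b1, b2): coeff}."""
    return {(a, c, b0, b1, b2): v
            for (a, w, c, b0, b1, b2), v in gamma_poly.items() if w == k}

# Example: F = Y0*Y2 - Y1^2 (no dependence on X).
F = {(0, 1, 0, 1): 1, (0, 0, 2, 0): -1}
G = gamma_op(F)
N = max(m[1] for m in G)    # degree in W
k0 = min(m[1] for m in G)   # lowest power of W
p_N = coefficient_of_W(G, N)
p_k0 = coefficient_of_W(G, k0)

print(N, p_N)    # top coefficient reproduces F and involves no C
print(k0, p_k0)  # bottom coefficient is 2*C*Y0*Y1 and involves no Y2
```

For this $F$ the output matches the two propositions: $N = 4 = \deg_{\mathbf{j}}(F)$ with $p_N = F$, and $k_0 = 3 = \operatorname{ord}_{\mathbf{j}}(F) \geq 2 \deg_{Y_1}(p_{k_0}) = 2$.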

Corollary 7.8

If $F \notin Y_1^{\mathbb{N}} \, \mathbb{C}[X, Y_0]$, then $\deg_{\mathbf{j}}(F) > 2 \deg_{Y_1}(p_{\operatorname{ord}_{\mathbf{j}}(F)})$.

Proof

One can immediately verify that

\[ \deg_{\mathbf{j}}(F) = \deg_T\big(F(X, Y_0, T^2 Y_1, T^4 Y_2)\big) \geq \deg_T\big(F(X, Y_0, T^2 Y_1, T^3 Y_2)\big) \geq \operatorname{ord}_{T=0}\big(F(X, Y_0, T^2 Y_1, T^3 Y_2)\big) = \operatorname{ord}_{\mathbf{j}}(F) \geq 2 \deg_{Y_1}(p_{\operatorname{ord}_{\mathbf{j}}(F)}), \]

where $\operatorname{ord}_{T=0}(P)$ is the maximum power of $T$ dividing $P$.

If the second inequality is an equality, then

\[ F(X, Y_0, T^2 Y_1, T^3 Y_2) = T^m F(X, Y_0, Y_1, Y_2), \]

where $m = \operatorname{ord}_{\mathbf{j}}(F)$. In particular,

\[ \deg_T\big(F(X, Y_0, T^2 Y_1, T^4 Y_2)\big) = \deg_T\big(T^m F(X, Y_0, Y_1, T Y_2)\big) = m + \deg_{Y_2}(F). \]

If the first inequality is also an equality, then $\deg_{Y_2}(F) = 0$, and moreover $F$ is homogeneous in $Y_1$ of degree $\tfrac{m}{2}$; thus $F \in Y_1^{\mathbb{N}} \, \mathbb{C}[X, Y_0]$. ∎
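The chain of inequalities can be tested on small examples using only the exponent weights. This is a hedged sketch with hypothetical helper names: a polynomial is a dictionary `{(a, b0, b1, b2): coeff}`, the 𝐣-degree is the largest value of $2\beta_1 + 4\beta_2$, the 𝐣-order is the smallest value of $2\beta_1 + 3\beta_2$, and the $Y_1$-degree of $p_{\operatorname{ord}_{\mathbf{j}}(F)}$ is read off from the formula in the proof of Proposition 7.7 (each lowest-weight monomial contributes $Y_1^{\beta_1 + \beta_2}$).

```python
def j_degree(poly):
    """Largest weight 2*b1 + 4*b2 over the monomials of poly."""
    return max(2 * b1 + 4 * b2 for (_, _, b1, b2) in poly)

def j_order(poly):
    """Smallest weight 2*b1 + 3*b2 over the monomials of poly."""
    return min(2 * b1 + 3 * b2 for (_, _, b1, b2) in poly)

def y1_degree_of_lowest(poly):
    """deg_{Y1} of the lowest W-coefficient: max of b1 + b2 at minimal weight."""
    m = j_order(poly)
    return max(b1 + b2 for (_, _, b1, b2) in poly if 2 * b1 + 3 * b2 == m)

def is_power_of_Y1_times_poly_in_X_Y0(poly):
    """True when poly = Y1^m * g(X, Y0) for a single exponent m."""
    return (all(b2 == 0 for (_, _, _, b2) in poly)
            and len({b1 for (_, _, b1, _) in poly}) == 1)

EXAMPLES = {
    "Y2": {(0, 0, 0, 1): 1},
    "X*Y1 + Y2^2": {(1, 0, 1, 0): 1, (0, 0, 0, 2): 1},
    "Y0*Y1^2": {(0, 1, 2, 0): 1},
}

for name, poly in EXAMPLES.items():
    strict = j_degree(poly) > 2 * y1_degree_of_lowest(poly)
    print(name, j_degree(poly), j_order(poly), strict,
          is_power_of_Y1_times_poly_in_X_Y0(poly))
```

In these samples the inequality is strict exactly for the polynomials outside $Y_1^{\mathbb{N}} \, \mathbb{C}[X, Y_0]$, which is the direction stated in the corollary; the sketch is a spot check, not a proof.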

We can now prove the main result of this paper, Theorem 1.2, the statement of which is recalled below for the convenience of the reader.

Theorem 1.2

For any polynomial $F(X, \mathbf{Y}) \in \mathbb{C}[X, \mathbf{Y}] \setminus \mathbb{C}[X]$ which is coprime to $Y_0 (Y_0 - 1728) Y_1$, the equation $F(z, \mathbf{j}(z)) = 0$ has a Zariski dense set of solutions.

Proof

It suffices to prove the conclusion for $F$ irreducible, not in $\mathbb{C}[X]$, and not a constant multiple of $Y_0$, $Y_0 - 1728$, or $Y_1$. Let $n = \deg_X(F)$ and write

\[ F^{\Gamma} = (CX + D)^n \, \Gamma(F)\!\left(\frac{AX + B}{CX + D}, CX + D, C, \mathbf{Y}\right) = (CX + D)^n \sum_{k=0}^{N} p_k\!\left(\frac{AX + B}{CX + D}, C, \mathbf{Y}\right) (CX + D)^k = \sum_{k=0}^{n+N} h_k(A, B, C, D, \mathbf{Y}) \, X^k. \]

We recall that $F^{\gamma} = F^{\Gamma}(a, b, c, d, X, \mathbf{Y})$ for any $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$. If $F \in \mathbb{C}[Y_0]$, the conclusion follows by Corollary 6.3, so we may assume that this is not the case, and in particular that $n + N > 0$.

We observe immediately that $h_{n+N} = C^n p_N(\tfrac{A}{C}, \mathbf{Y})$ using Proposition 7.5. Moreover, by Proposition 7.7,

\[ h_0 = D^n \sum_{k=0}^{N} p_k\!\left(\tfrac{B}{D}, C, \mathbf{Y}\right) D^k = D^n \sum_{k=\operatorname{ord}_{\mathbf{j}}(F)}^{N} p_k\!\left(\tfrac{B}{D}, C, \mathbf{Y}\right) D^k. \]

We claim that, for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in some Zariski open dense subset of $\mathrm{SL}_2(\mathbb{Z})$, we have that $h_{n+N}(a, b, c, d, \mathbf{Y})$ has a factor $r \in \mathbb{C}[\mathbf{Y}]$ such that the equation $r(\mathbf{j}(z)) = 0$ has a Zariski dense set of solutions, or that $h_0 / h_{n+N}(a, b, c, d, \mathbf{j}(z))$ has a pole or exponential growth in some fundamental domain. In particular, for any one of those $\gamma$'s, we find that

\[ F^{\gamma}(z, \mathbf{j}(z)) = \sum_{k=0}^{n+N} h_k(a, b, c, d, \mathbf{j}(z)) \, z^k = 0 \]

has a Zariski dense set of solutions (by Corollary 5.5 in the first case, and Proposition 5.4 in the second one); hence so does $F(z, \mathbf{j}(z)) = 0$ (by Proposition 5.1), as desired.

To prove the claim, we distinguish three cases.

Suppose that $p_N$ is not in $Y_0^{\mathbb{N}} (Y_0 - 1728)^{\mathbb{N}} Y_1^{\mathbb{N}} \, \mathbb{C}[Z]$. Recall that $p_N$ is 𝐣-homogeneous by Proposition 7.5. Since $h_{n+N} = C^n p_N(\tfrac{A}{C}, \mathbf{Y})$, if $p_N$ depends on $Y_2$, so does $h_{n+N}(\tfrac{a}{c}, \mathbf{Y})$ for all but finitely many values of $\tfrac{a}{c}$. If $p_N$ does not depend on $Y_2$, then $p_N(\tfrac{A}{C}, \mathbf{Y}) = Y_1^{N/2} \, h(\tfrac{A}{C}, Y_0)$ for some polynomial $h \in \mathbb{C}[Z, Y_0]$, and by assumption, $h(\tfrac{A}{C}, Y_0)$ has at least one root distinct from $0$ and $1728$, seen as a polynomial in $Y_0$ (in an algebraic closure of $\mathbb{C}(\tfrac{A}{C})$), in which case so does $h(\tfrac{a}{c}, Y_0)$ except for finitely many values of $\tfrac{a}{c}$.

In either case, since the map $\gamma \mapsto \tfrac{a}{c}$ from $\mathrm{SL}_2(\mathbb{C})$ to $\mathbb{C}$ is dominant, we get that, for all $\gamma$ except on some proper Zariski closed subset of $\mathrm{SL}_2(\mathbb{Z})$, the polynomial

\[ h_{n+N}(a, b, c, d, \mathbf{Y}) = c^n p_N\!\left(\tfrac{a}{c}, \mathbf{Y}\right) \]

has an irreducible factor $r \in \mathbb{C}[\mathbf{Y}]$ such that $r(\mathbf{j}(z)) = 0$ has a Zariski dense set of solutions (by respectively Theorem 7.3, after noticing that the factors of a 𝐣-homogeneous polynomial are 𝐣-homogeneous, and Corollary 6.3).

Suppose that $F$ is in $\mathbb{C}[X, Y_0]$. By irreducibility of $F$, we have that $F = F(X, Y_0)$ is divisible by neither $Y_0$ nor $Y_0 - 1728$. In particular, $p_N = \Gamma(F) = F(Z, Y_0)$ is also divisible by neither $Y_0$ nor $Y_0 - 1728$, which puts us back in the previous case.

Suppose that $p_N$ is in $Y_0^{\mathbb{N}} (Y_0 - 1728)^{\mathbb{N}} Y_1^{\mathbb{N}} \, \mathbb{C}[Z]$ and $F$ is not in $\mathbb{C}[X, Y_0]$. Write $p_N = r(Z) \, Y_0^s (Y_0 - 1728)^t Y_1^{\ell}$. Since $F$ is irreducible, $F$ is also not in $Y_1^{\mathbb{N}} \, \mathbb{C}[X, Y_0]$; thus, by Corollary 7.8,

\[ 2\ell = 2 \deg_{Y_1}(p_N) = \deg_{\mathbf{j}}(p_N) = \deg_{\mathbf{j}}(F) > 2 \deg_{Y_1}(p_{\operatorname{ord}_{\mathbf{j}}(F)}). \]

Since the map $\gamma \mapsto (\tfrac{a}{c}, \tfrac{b}{d}, c)$ from $\mathrm{SL}_2(\mathbb{C})$ to $\mathbb{C}^3$ is dominant, for $\gamma$ in some Zariski open dense subset of $\mathrm{SL}_2(\mathbb{Z})$, we have

\[ \deg_{Y_1}\big(p_N(\tfrac{a}{c}, c, \mathbf{Y})\big) = \deg_{Y_1}(p_N) > \deg_{Y_1}(p_{\operatorname{ord}_{\mathbf{j}}(F)}) = \deg_{Y_1}\big(p_{\operatorname{ord}_{\mathbf{j}}(F)}(\tfrac{b}{d}, c, \mathbf{Y})\big), \]

in which case the function

\[ \frac{p_{\operatorname{ord}_{\mathbf{j}}(F)}(\tfrac{b}{d}, c, \mathbf{j}(z))}{p_N(\tfrac{a}{c}, \mathbf{j}(z))} \]

has a pole at some $\tau \in \mathrm{SL}_2(\mathbb{Z})\rho \cup \mathrm{SL}_2(\mathbb{Z})i$ or exponential growth in some fundamental domain by Corollary 6.8 (recall that, by Proposition 7.7, we know that $p_{\operatorname{ord}_{\mathbf{j}}(F)}$ does not depend on $Y_2$). If we fix some $\tfrac{b}{d}$, $c$ as above, and a corresponding pole $\tau$ or a fundamental domain $\eta\mathbb{F}$ with exponential growth, then the function

\[ \frac{h_0(\tfrac{b}{d}, c, \mathbf{j}(z))}{h_{n+N}(\tfrac{a}{c}, \mathbf{j}(z))} = \frac{d^n}{c^n \, r(\tfrac{a}{c})} \sum_{k=\operatorname{ord}_{\mathbf{j}}(F)}^{N} \frac{p_k(\tfrac{b}{d}, c, \mathbf{j}(z))}{j(z)^s (j(z) - 1728)^t j'(z)^{\ell}} \, d^k \]

has neither a pole at $\tau$ nor exponential growth in $\eta\mathbb{F}$ only when $d$ satisfies a non-trivial polynomial equation over $\tfrac{b}{d}$, $c$. Since $\gamma \mapsto (\tfrac{b}{d}, c, d)$ is also a dominant map, for $\gamma$ in a Zariski open dense subset of $\mathrm{SL}_2(\mathbb{Z})$, the above function has a pole at $\tau$ or exponential growth in $\eta\mathbb{F}$, as claimed. ∎

7.3 Two more examples

Example 7.9

Let us apply Theorem 1.2 to get some information on the zeroes of the function $j'''$. From the differential equation of the $j$-function (2.2), we see that

\[ j''' = \frac{3}{2} \frac{(j'')^2}{j'} - \frac{j^2 - 1968 j + 2654208}{2 j^2 (j - 1728)^2} (j')^3. \]

By Theorem 1.2, the equation

\[ 3 j^2 (j - 1728)^2 (j'')^2 - (j^2 - 1968 j + 2654208)(j')^4 = 0 \tag{7.2} \]

has a Zariski dense set of solutions outside $\mathrm{SL}_2(\mathbb{Z})\rho \cup \mathrm{SL}_2(\mathbb{Z})i$ (this actually follows directly from Theorem 7.3, because the equation is 𝐣-homogeneous). Then these are also solutions of $j'''(z) = 0$.

Upon applying the transformation $z \mapsto -\frac{1}{z}$, we get

\[ z^8 \Big( 3 j(z)^2 (j(z) - 1728)^2 j''(z)^2 - \big(j(z)^2 - 1968 j(z) + 2654208\big) j'(z)^4 \Big) + z^7 \cdot 12 \, j(z)^2 (j(z) - 1728)^2 j'(z) j''(z) + z^6 \cdot 12 \, j(z)^2 (j(z) - 1728)^2 j'(z)^2 = 0. \tag{7.3} \]
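The passage from (7.2) to (7.3) is a polynomial identity in $z$, $j$, $j'$, $j''$, obtained from $j(-1/z) = j(z)$, $j'(-1/z) = z^2 j'(z)$ and $j''(-1/z) = z^4 j''(z) + 2 z^3 j'(z)$. The sketch below spot-checks it with exact rational arithmetic, treating $j$, $j'$, $j''$ as independent variables; the function names and the sample points are arbitrary choices, and agreement at finitely many points is a sanity check rather than a proof.

```python
from fractions import Fraction
from itertools import product

def lhs_7_2(j, j1, j2):
    """Left-hand side of (7.2) with j, j', j'' treated as free variables."""
    return (3 * j**2 * (j - 1728)**2 * j2**2
            - (j**2 - 1968 * j + 2654208) * j1**4)

def lhs_7_3(z, j, j1, j2):
    """Left-hand side of (7.3) as displayed, grouped by powers of z."""
    a = j**2 * (j - 1728)**2
    return (z**8 * lhs_7_2(j, j1, j2)
            + z**7 * 12 * a * j1 * j2
            + z**6 * 12 * a * j1**2)

def transformed_7_2(z, j, j1, j2):
    """(7.2) evaluated on the transformed triple for z -> -1/z."""
    return lhs_7_2(j, z**2 * j1, z**4 * j2 + 2 * z**3 * j1)

SAMPLES = [Fraction(1, 2), Fraction(-3, 5), Fraction(7, 3)]
all_equal = all(lhs_7_3(*point) == transformed_7_2(*point)
                for point in product(SAMPLES, repeat=4))
print(all_equal)
```

Expanding $(z^4 j'' + 2 z^3 j')^2 = z^8 (j'')^2 + 4 z^7 j' j'' + 4 z^6 (j')^2$ by hand gives the same three groups of terms.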

We can see that the ratios

\[ \frac{12 \, j^2 (j - 1728)^2 j' j''}{3 j^2 (j - 1728)^2 (j'')^2 - (j^2 - 1968 j + 2654208)(j')^4}, \qquad \frac{12 \, j^2 (j - 1728)^2 (j')^2}{3 j^2 (j - 1728)^2 (j'')^2 - (j^2 - 1968 j + 2654208)(j')^4} \]

are equal to $0$ at $i$ and $\rho$ and do not have exponential growth in any fundamental domain. However, we know that (7.2) has a zero $\tau \notin \mathrm{SL}_2(\mathbb{Z})\rho \cup \mathrm{SL}_2(\mathbb{Z})i$. Therefore, the second ratio above has a pole at $\tau$, and so, for all large enough $m$, equation (7.3) has a zero near $\tau + m$. This means that (7.2), and hence $j''' = 0$, has solutions near $-\frac{1}{\tau + m}$ which accumulate at $0$.

Example 7.10

Given $F(z, \mathbf{j}(z)) = 0$, our strategy in the proof of Theorem 1.2 is to apply a generic $\mathrm{SL}_2(\mathbb{Z})$-transformation and show that, in the function $F(\gamma z, \mathbf{j}(\gamma z))$, the ratio of the coefficients of the lowest and highest powers of $z$ has a pole at some point $\tau$ or has exponential growth at a cusp. In some cases, e.g. when $F(X, \mathbf{Y})$ does not depend on $X$, we can keep things simple and just use the transformation $z \mapsto -\frac{1}{z}$. Indeed, in this case, it turns out that the coefficient of the lowest power of $z$ does not depend on $j''$ and so we can apply Corollary 6.8. For instance,

\[ j''\!\left(-\tfrac{1}{z}\right)^2 = z^8 j''(z)^2 + 4 z^7 j'(z) j''(z) + 4 z^6 j'(z)^2. \]

We give an example to show that the transformation $-\frac{1}{z}$, and even $a - \frac{1}{z}$ for any integer $a$, does not suffice in general (when $F$ depends on $X$), as the aforementioned coefficient may depend on $j''$. Consider the function $f(z) = 2 j'(z) + z j''(z)$. After a $z \mapsto a - \frac{1}{z}$ transformation, we get

\[ f\!\left(a - \tfrac{1}{z}\right) = 2 z^2 j'(z) + \left(a - \tfrac{1}{z}\right)\big(z^4 j''(z) + 2 z^3 j'(z)\big) = a z^4 j''(z) + z^3 \big(2 a j'(z) - j''(z)\big). \]

We see that the coefficient of $z^3$ depends on $j''$ regardless of the value of $a$.
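The simplification in this example can be confirmed with exact arithmetic. The following sketch treats $j'(z)$ and $j''(z)$ as free rational parameters and uses the transformation rules for $\gamma z = a - \frac{1}{z}$ (so $c = 1$, $d = 0$); the function names and sample values are illustrative assumptions only.

```python
from fractions import Fraction
from itertools import product

def f_transformed(a, z, j1, j2):
    """f(a - 1/z) with j'(gamma z) = z^2 j'(z), j''(gamma z) = z^4 j''(z) + 2 z^3 j'(z)."""
    return 2 * (z**2 * j1) + (a - 1 / z) * (z**4 * j2 + 2 * z**3 * j1)

def f_expanded(a, z, j1, j2):
    """Right-hand side of the displayed identity, grouped by powers of z."""
    return a * z**4 * j2 + z**3 * (2 * a * j1 - j2)

INTEGERS = [-2, 0, 1, 5]                                # values of the integer a
RATIONALS = [Fraction(1, 3), Fraction(-4, 7), Fraction(2)]  # non-zero sample points

identity_holds = all(
    f_transformed(a, z, j1, j2) == f_expanded(a, z, j1, j2)
    for a in INTEGERS
    for z, j1, j2 in product(RATIONALS, repeat=3)
)
print(identity_holds)
```

The $z^2$ terms cancel, so $z^3$ is the lowest surviving power, and its coefficient $2a\,j' - j''$ contains $j''$ for every $a$, which is exactly the obstruction described above.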

Funding source: Leverhulme Trust

Award Identifier / Grant number: ECF-2022-082

Award Identifier / Grant number: EP/X009823/1

Award Identifier / Grant number: EP/T018461/1

Funding statement: Vahagn Aslanyan was supported by Leverhulme Trust Early Career Fellowship ECF-2022-082 at the University of Leeds (where most of this work was done) and by EPSRC Fellowship EP/X009823/1 and DKO Fellowship at the University of Manchester. Sebastian Eterović and Vincenzo Mantova were supported by EPSRC Fellowship EP/T018461/1 at the University of Leeds.

Acknowledgements

We thank the referee for a thorough reading of the paper and for numerous suggestions that helped us improve the presentation.

References

[1] V. Aslanyan, Adequate predimension inequalities in differential fields, Ann. Pure Appl. Logic 173 (2022), no. 1, Article ID 103030. doi:10.1016/j.apal.2021.103030

[2] V. Aslanyan, S. Eterović and J. Kirby, Differential existential closedness for the 𝑗-function, Proc. Amer. Math. Soc. 149 (2021), no. 4, 1417–1429. doi:10.1090/proc/15333

[3] V. Aslanyan, S. Eterović and J. Kirby, A closure operator respecting the modular 𝑗-function, Israel J. Math. 253 (2023), no. 1, 321–357. doi:10.1007/s11856-022-2362-y

[4] V. Aslanyan and J. Kirby, Blurrings of the 𝐽-function, Q. J. Math. 73 (2022), no. 2, 461–475. doi:10.1093/qmath/haab037

[5] V. Aslanyan, J. Kirby and V. Mantova, A geometric approach to some systems of exponential equations, Int. Math. Res. Not. IMRN 2023 (2023), no. 5, 4046–4081. doi:10.1093/imrn/rnab340

[6] W. D. Brownawell and D. W. Masser, Zero estimates with moving targets, J. Lond. Math. Soc. (2) 95 (2017), no. 2, 441–454. doi:10.1112/jlms.12014

[7] P. D’Aquino, A. Fornasiero and G. Terzo, Generic solutions of equations with iterated exponentials, Trans. Amer. Math. Soc. 370 (2018), no. 2, 1393–1407. doi:10.1090/tran/7206

[8] S. Eterović, Generic solutions of equations involving the modular 𝑗 function, Math. Ann. 391 (2025), no. 4, 6401–6449. doi:10.1007/s00208-024-03082-6

[9] S. Eterović and S. Herrero, Solutions of equations involving the modular 𝑗 function, Trans. Amer. Math. Soc. 374 (2021), no. 6, 3971–3998. doi:10.1090/tran/8244

[10] F. P. Gallinaro, Solving systems of equations of raising-to-powers type, Israel J. Math. (2025). doi:10.1007/s11856-025-2778-2

[11] F. P. Gallinaro, Exponential sums equations and tropical geometry, Selecta Math. (N. S.) 29 (2023), no. 4, Paper No. 49. doi:10.1007/s00029-023-00853-y

[12] J. Kirby, Blurred complex exponentiation, Selecta Math. (N. S.) 25 (2019), no. 5, Paper No. 72. doi:10.1007/s00029-019-0517-4

[13] S. Lang, Elliptic functions, 2nd ed., Grad. Texts in Math. 112, Springer, New York 1987. doi:10.1007/978-1-4612-4752-4

[14] S. Lang, Complex analysis, 4th ed., Grad. Texts in Math. 103, Springer, New York 1999. doi:10.1007/978-1-4757-3083-8

[15] K. Mahler, On algebraic differential equations satisfied by automorphic functions, J. Aust. Math. Soc. 10 (1969), 445–450. doi:10.1017/S1446788700007709

[16] B. Zilber, Exponential sums equations and the Schanuel conjecture, J. Lond. Math. Soc. (2) 65 (2002), no. 1, 27–44. doi:10.1112/S0024610701002861

[17] B. Zilber, Pseudo-exponentiation on algebraically closed fields of characteristic zero, Ann. Pure Appl. Logic 132 (2005), no. 1, 67–95. doi:10.1016/j.apal.2004.07.001

[18] B. Zilber, The theory of exponential sums, preprint (2015), https://arxiv.org/abs/1501.03297.

Received: 2023-12-15
Revised: 2025-07-31
Published Online: 2025-10-11

© 2025 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 21.10.2025 from https://www.degruyterbrill.com/document/doi/10.1515/crelle-2025-0067/html