KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION

advertisement
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC
EQUATION
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Abstract. If F (x, y) ∈ Z[x, y] is an irreducible binary form of degree k ≥ 3
then a theorem of Darmon and Granville implies that the generalized superelliptic equation
F (x, y) = z l
has, given an integer l ≥ max{2, 7 − k}, at most finitely many solutions in
coprime integers x, y and z. In this paper, for large classes of forms of degree
k = 3, 4, 6 and 12 (including, heuristically, “most” cubic forms), we extend this
to prove a like result, where the parameter l is now taken to be variable. In the
case of irreducible cubic forms, this provides the first examples where such a
conclusion has been proven. The method of proof combines classical invariant
theory, modular Galois representations, and properties of elliptic curves with
isomorphic mod n Galois representations.
Contents
1. Introduction
2. Klein forms
2.1. Some invariant theory
2.2. The syzygy
3. From elliptic curves to Klein forms (and back)
4. Frey-Hellegouarch curves for Klein forms
4.1. Twisting and minimization at 3
4.2. Twisting and minimization at 2
4.3. Frey-Hellegouarch curves: conclusions
5. Galois representations attached to E/Q
6. Moduli of elliptic curves: constant n-torsion
6.1. Curves with isomorphic n-torsion
6.2. Irreducibility and ramification properties
7. The Chebotarev density theorem
8. The modular method
8.1. The main theorem
9. A cubic family
9.1. From Thue-Mahler equations to Thue equations
9.2. Solutions as convergents
9.3. Applying the method of Thue-Siegel
2
5
5
7
9
11
12
12
13
13
14
15
17
18
20
23
24
26
27
31
Date: June 2010.
1991 Mathematics Subject Classification. Primary 11D41, 11D59, Secondary 11G05, 11G30,
11J68.
Key words and phrases. Superelliptic equations, Galois representations, Thue-Mahler equations, elliptic curves with isomorphic n-torsion.
1
2
MICHAEL A. BENNETT AND SANDER R. DAHMEN
9.4. Continued fraction expansions to θ2
10. A quartic family
11. Higher degree families of Klein forms
12. Heuristics for cubic forms
13. Diagonal forms
14. Examples and computations
14.1. Solving Thue-Mahler equations
14.2. Studying elliptic curves at candidate levels
14.3. Small bounds for the exponents
15. Ackowledgments
Appendix A. Conductor calculations
A.1. Quartic Klein forms at p = 2
A.2. Conductor calculations at p k δn
References
31
35
38
40
42
45
45
46
51
52
52
52
53
55
1. Introduction
A classic result of Siegel [48], from 1929, is that the set of K-integral points
on a smooth algebraic curve of positive genus, defined over a number field K, is
finite. As an application of this to Diophantine equations, Leveque [29] showed that
if f (x) ∈ Z[x] is a polynomial of degree k ≥ 2 with, say, no repeated roots, and
l ≥ max{2, 5 − k} is an integer, then the superelliptic equation
(1)
f (x) = y l
has at most finitely many solutions in integers x and y. Already in a 1925 letter
from Siegel to Mordell (partly published in 1926 under the pseudonym X [47]),
Siegel had proved precisely this result in case l = 2 (and had remarked that his
argument readily extends to all exponents l ≥ 2). Via lower bounds for linear
forms in logarithms, Schinzel and Tijdeman [42] deduced that, in fact, equation (1)
has at most finitely many solutions in integers x, y and variable l ≥ max{2, 5 − k}
(where we count the solutions with y l = ±1, 0 only once). This latter result has
the additional advantage over Leveque’s theorem in that it is effective (the finite
set of values for x is effectively computable).
If we replace the polynomial f (x) with a binary form F (x, y) over Z (i.e. a
homogeneous polynomial in Z[x, y]) of degree k ≥ 3, with no repeated roots, then
work of Darmon and Granville (Theorem 1 of [8]) ensures that the corresponding
generalized superelliptic equation
(2)
F (x, y) = z l ,
gcd(x, y) = 1
has, for fixed l ≥ max{2, 7 − k}, at most finitely many solutions in integers x and
y. That the restriction of coprimality is a necessary one is readily observed; in case
(k, l) = (3, 3) or (4, 2), equation (2) corresponds, generically, to a curve of genus 1.
In the proof of the result of Darmon and Granville, Faltings’ theorem [18] plays an
analogous role to that of Siegel’s theorem for equation (1).
In what follows, our goal is to investigate the degree to which it is possible to
generalize the theorem of Darmon and Granville to prove a result akin to that of
Schinzel and Tijdeman, valid for binary forms rather than polynomials. That is,
we wish to address the following
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
3
Question. For a given binary form F (x, y) ∈ Z[x, y] of degree k, is the number of
integer 4-tuples (x, y, z, l) satisfying equation (2) and l ≥ max{2, 7 − k} finite?
For many quadratic forms F (and for certain forms with repeated factors in
Z[x, y]; see Theorem 1 of [8]), the answer to this question is clearly “no”. In
general, for forms of higher degree, the problem rapidly becomes more complicated
– to illustrate its difficulty, it is worthwhile noting that the affirmative answer to
this question, in case
F (x, y) = xy(x + y) or x(x2 − y 2 ),
and z 6= 0, ±1, is equivalent to the asymptotic version of Wiles’ theorem [58], née
Fermat’s Last Theorem, and to a special case of a result of Darmon and Merel [9],
respectively.
For irreducible forms (over Q[x, y]), the only results in the literature answering
our question affirmatively are for certain diagonal quartic forms, such as F (x, y) =
x4 + y 4 (together with “covering” forms of higher degree), due to Ellenberg [15]
and to Dieulefait and Jiménez Urroz [12]) (see also [2]). In particular, there are
hitherto no primitive, irreducible cubic forms for which the above question has been
addressed. A special case of the main result of this paper (Theorem 8.4 of Section
8) is the following:
Theorem 1.1. Suppose that F (x, y) ∈ Z[x, y] is an irreducible binary cubic form
and that SF is the set of primes dividing 2 ∆F , together with a prime at ∞, where
∆F is the discriminant of F . If the Thue-Mahler equation
(3)
F (x, y) ∈ Z∗SF
has no solutions in integers x and y, then there exists an effectively computable
constant l0 (depending on F ), such that there are no solutions to equation (2) in
integers x, y and z, and prime l > l0 . Consequently, equation (2) has at most
finitely many solutions in integers x, y, z and integer l ≥ 4.
Here, we denote by Z∗SF , the set of SF -units, that is, the nonzero integers u with
the property that if a prime p satisfies p | u, then p ∈ SF .
At first blush, the conditions imposed upon F by the insolubility of (3) appear
to be extremely restrictive. In particular, since 1 ∈ Z∗SF , the above theorem fails to
apply to any GL2 (Z)-equivalence classes of monic forms. Indeed, of the 190 classes
of primitive, irreducible cubic forms with |∆F | ≤ 1000, equation (3) has integer
solutions in every case! Despite this, in Section 12, we will sketch a heuristic which
indicates that this is the law of small numbers at play and, in fact, “almost all”
cubic forms F have the property that (3) has no solutions in integers. In Section
9, we will demonstrate this for an infinite family of cubic forms. A representative
form of smallest absolute discriminant to which we may apply Theorem 1.1 is
F (x, y) = 3x2 + 2x2 y + 5xy 2 + 3y 3 ,
with discriminant ∆F = −2063; there are 24 such classes of forms with (3) insoluble
and |∆F | ≤ 104 . It is worth remarking, at this juncture, that there are effective,
indeed efficient, algorithms for determining all solutions to equation (3) (see e.g.
Tzanakis and de Weger [53]).
We derive analogues to Theorem 1.1 for forms of degrees 4, 6 and 12 which are
somewhat less general, in that they apply only to Klein forms of these degrees.
As we shall see, the Klein forms are a density zero subset of the set of all binary
4
MICHAEL A. BENNETT AND SANDER R. DAHMEN
forms of degrees 4, 6 and 12. For these forms, the finiteness of solutions to (2)
is a consequence of the abc-conjecture (over Q) of Masser and Oesterlé. For more
general irreducible forms, this does not seem to be the case and one must apparently
combine a version of the abc-conjecture, valid in number fields (as in Elkies [16]),
with some careful descent arguments.
The results of this paper may be viewed as an attempt to develop methods for
solving Diophantine equations, arising from the modularity of Galois representations, which exhibit greater flexibility than those currently in the literature. In
particular, we wish to illustrate that such techniques are not simply limited to
ternary equations akin to the generalized Fermat equation. Our method of proof
of Theorem 1.1 and related results proceeds via construction of Frey or, if you like,
Frey-Hellegouarch curves over Q, corresponding to putative solutions to equation
(2) for Klein forms F (x, y) of index n. Here, 2 ≤ n ≤ 5 is an integer; the case n = 2
is that of nondegenerate cubic forms. Typically, in such matters, one is led to study
arithmetic properties of weight 2 cuspidal newforms of fixed level N . Previously,
the combination of the presence of one-dimensional forms at level N (equivalently,
elliptic curves E/Q of conductor N ), with the absence of rational isogenies for
the given Frey-Hellegouarch curve, has provided a substantial barrier to progress.
What is novel in the present paper is that we are able to overcome these difficulties
by exploiting the fact that the Frey-Hellegouarch curves corresponding to a fixed
Klein form have constant n-torsion (equivalently, isomorphic mod-n Galois representations). In a certain sense, this property is stronger than the existence of a
rational n-isogeny, and plays an analogous role in our proofs to that of the latter
in, for instance, [9].
The outline of this paper is as follows. In Section 2, we discuss the necessary
background information from classical invariant theory and define the notion of
a Klein forms. Section 3 details the connection between Klein forms and elliptic
curves. In Section 4, we describe how to attach families of Frey-Hellegouarch curves
to Klein forms. Section 5 is a brief summary of (well-known) results on Galois
representations arising from elliptic curves.
In Section 6, we introduce the fundamental new observation, key to the proof
of Theorem 1.1, that the families of Frey-Hellegouarch curves defined in Section
4 possess, for a fixed Klein form F of index n, constant n-torsion. Together with
technical hypotheses, this enables us to eliminate the possibility of the corresponding mod l Galois representations arising from modular forms of dimension 1 at
certain levels N , for suitably large prime l. Section 7 contains the information we
require from an effective version of the Chebotarev density theorem to quantify
this last statement. In Section 8, we appeal to the so-called modular method, based
upon modularity of the canonical mod l Galois representations associated to our
Frey-Hellegouarch curves, together with level-lowering, to state our main results.
The remainder of the paper is devoted to discussing the applicability of the
results of Section 8. In particular, in Sections 9, 10 and 11, we appeal to our main
theorem to answer our question on the finiteness of solutions to (2) affirmatively
for families of binary cubic, quartic, sextic and duodecic (degree 12) forms. As a
sample of the results from these sections, we have
Theorem 1.2. If a is an integer such that a2 + 9a + 81 is squarefree and neither
a nor −a − 9 is in the set {−4, 8, 22, 31}, then the Diophantine equation
3x3 − ax2 y − (a + 9)xy 2 − 3y 3 = z n
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
5
has at most finitely many solutions in nonzero integers x, y, z and n, with gcd(x, y) =
1 and n ≥ 4.
This provides an infinite family of forms F satisfying the hypotheses of Theorem
1.1. The proof of Theorem 1.2 is by no means straightforward. Indeed, it requires
the solution of an infinite family of Thue-Mahler equations; as far as we are aware,
this is the first “nontrivial” such family to be solved where the corresponding set of
nonarchimedean primes is unbounded (in both cardinality and height). Additionally, the techniques of Section 9 are quite distinct from those of the remainder of
the paper and may be of independent interest.
In Section 12, we discuss heuristics which indicate that Theorem 1.1 is applicable
to a density one subset of the set of all cubic forms. The remaining sections are of a
less speculative nature, being concerned with the specialization of our arguments to
diagonal forms (Section 13), and with explicit computations and examples (Section
14), chosen to illustrate the utility of our methods.
2. Klein forms
2.1. Some invariant theory. We will begin by describing a number of results from
classical invariant theory; these are well covered in Klein [25] or Hilbert [23]. More
recent references for what we require along these lines are the book of Olver [37]
and, from a more arithmetic perspective, the thesis of Edwards [13], corresponding
article [14], and the survey of Beukers [3].
Our starting point is a topic central to invariant theory in the 19th century,
namely the theory of invariants and covariants for binary forms. Let F ∈ Q[x, y]
be such a form, say
F (x, y) =
k
X
αi xk−i y i ,
i=0
and M =
a
c
b
d
∈ GL2 (Q). Then by taking
(F ◦ M )(x, y) = F (ax + by, cx + dy),
we obtain a right action of GL2 (Q) on the set of nondegenerate binary forms over
Q of given degree k. Suppose that C is a form in x and y with coefficients that are
homogeneous polynomials over Q in the coefficients of F , (α0 , α1 , . . . , αk ); we write
C = C(F ) to emphasize its dependence on the form F . Call C(F ) a covariant of
F if there exists an integer p ≥ 0 such that
C(F ◦ M ) = det(M )p C(F ) ◦ M
for all M ∈ GL2 (Q). We define p to be the weight of the covariant C. If a covariant
I depends only on the coefficients (α0 , α1 , . . . , αk ) and not upon x and y, so that
I(F ◦ M ) = det(M )p I(F ),
we call I an invariant of weight p for the form F .
A theorem of Gordan [20] from 1868 asserts, given a fixed degree k, the existence
of a finite set C1 , C2 , . . . , Ct of covariants such that every covariant of a binary form
F of degree k is a polynomial over Q in the Cj ’s (such a set is nowadays termed
6
MICHAEL A. BENNETT AND SANDER R. DAHMEN
a Hilbert basis). For binary forms of degrees 3, 4, 6 and 12, there are in fact the
following numbers of independent invariants and covariants:
degree
# invariants
# covariants
3
1
4
4
2
5
6
5
26
12
109
949
In the simplest case (at least for our purposes), that of cubic forms, the invariants
are, up to scaling, just powers of the discriminant of the binary form:
∆F = 18 α0 α1 α2 α3 + α12 α22 − 27 α02 α32 − 4 α0 α23 − 4 α13 α3 .
The independent covariants may be taken as ∆F (which has weight 6), the form F
itself (of weight 0), the Hessian of F , i.e. the quadratic form
1 F
Fxy = (3α0 α2 − α12 )x2 + (9α0 α3 − α1 α2 )xy + (3α1 α3 − α22 )y 2
H(x, y) = xx
4 Fxy Fyy (of weight 2) and the Jacobian determinant of F and H, in this case the cubic form
Fx Fy G(x, y) = Hx Hy (of weight 3). Here, as is customary, Fx , Fy , etc, refer to corresponding partial
derivatives. Connecting all four covariants, one has the syzygy
(4)
4 H(x, y)3 + G(x, y)2 = −27 ∆F F (x, y)2 .
It is this last identity (and its analogues) that we will exploit for Diophantine
purposes.
For forms F of degree k > 3, we must narrow our focus considerably as, in
general, we are unable to deduce a ternary relationship akin to (4). Let us define
binary forms Fn (x, y) for n = 2, 3, 4 and 5 as follows:
(5)
F2 (x, y) =
F3 (x, y) =
F4 (x, y) =
F5 (x, y) =
xy(x + y),
y(x3 + y 3 ),
xy(x4 + y 4 ),
xy(x10 − 11x5 y 5 − y 10 ).
These forms arise in Klein’s [25] classification of the finite subgroups of AutQ (P1 ),
where F2 , F3 , F4 and F5 are homogenizations of polynomials whose roots correspond
to the vertices of an equilateral triangle, tetrahedron, octahedron and icosahedron,
respectively, after projection to the complex plane. We call F a Klein form if
F = Fn ◦ M for some n ∈ {2, 3, 4, 5} and M ∈ GL2 (Q); the integer n is uniquely
determined by the Klein form and is termed its index.
As it transpires, via a classic theorem of Gordan [21], if F is a binary form of
degree k ≥ 3 with nonzero discriminant, then F is a Klein form precisely when the
fourth transvectant τ4 (F ) vanishes. Transvectants are a class of covariants which
are typically defined via either direct appeal to the representation theory of SL2
and the Clebsch-Gordan formulae, or, through the use of differential operators (see
e.g. Olver [37]). We will not provide an explicit definition for τ4 (F ) in general
(the interested reader is directed to Appendix A of [13]), but instead note that
τ4 (F ) ≡ 0 is equivalent to the simultaneous vanishing of certain quadratic forms in
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
7
the coefficients of F . If F is cubic, then τ4 (F ) is, in fact, identically zero. Otherwise,
writing
(6)
F (x, y) =
k
X
αi xk−i y i ,
i=0
we have τ4 (F ) ≡ 0 for a form of degree 4, precisely when
(7)
12 α0 α4 − 3 α1 α3 + α22 = 0.
For forms of degree 6, if α0 6= 0, then τ4 (F ) ≡ 0 is equivalent to
(8)
10 α0 α4 − 5 α1 α3 + 2 α22 = 0,
25 α0 α5 − 5 α1 α4 + α2 α3 = 0,
50 α0 α6 − 2 α2 α4 + α32 = 0.
Finally, for forms of degree 12, again if α0 6= 0, then τ4 (F ) being identically zero is
equivalent to the simultaneous vanishing of the following 9 quadratic forms:
(9)
44 α0 α4 − 33 α1 α3 + 15 α22 ,
55 α0 α5 − 22 α1 α4 + 6 α2 α3 ,
330 α0 α6 − 55 α1 α5 − 20 α2 α4 + 18 α32 ,
55 α0 α7 − 5 α2 α5 + 2 α3 α4 ,
440 α0 α8 + 55 α1 α7 − 30 α2 α6 − 5 α3 α5 + 8 α42 ,
198 α0 α9 + 44 α1 α8 − 5 α2 α7 − 6 α3 α6 + 3 α4 α5 ,
660 α0 α10 + 198 α1 α9 + 16 α2 α8 − 19 α3 α7 − 2 α4 α6 + 5 α52 ,
3630 α0 α11 + 1320 α1 α10 + 270 α2 α9 − 52 α3 α8 − 45 α4 α7 + 25 α5 α6 ,
and
21780α0 α12 + 9075α1 α11 + 2670α2 α10 + 171α3 α9 − 284α4 α8 − 25α5 α7 + 75α62 .
For forms of degree 6 and 12, there are additional conditions in case α0 = 0 – we
have little need for them here. We direct the reader to Appendix A of [13] for
details.
It follows from (7), (8) and (9) that a Klein form with α0 6= 0 is entirely determined by the values of its first four coefficients α0 , α1 , α2 and α3 . We will refer to
this fact frequently in the sequel and, here and henceforth, adopt the shorthand
(α0 , α1 , α2 , α3 )k =
k
X
αi xk−i y i ,
i=0
for a Klein form of degree k. One may note, if α0 6= 0, that a quadruple of rationals
(α0 , α1 , α2 , α3 ) corresponds, via scalar multiplication, to precisely one Klein form
F (x, y) ∈ Z[x, y] of each degree k ∈ {3, 4, 6, 12}, with positive leading coefficient.
Supposing αi ∈ Z for each i = 0, 1, 2, 3, however, does not guarantee that the
associated Klein form (α0 , α1 , α2 , α3 )k is actually in Z[x, y].
2.2. The syzygy. Our main purpose in studying Klein forms is that certain of
their covariants satisfy a ternary relationship, closely analogous to (4). This will
prove important in the sequel, for our construction of Frey-Hellegouarch curves
corresponding to solutions to (2). Let F be a Klein form of degree k ∈ {3, 4, 6, 12}
and index n = 6 − 12/k. As in the case k = 3, we define the Hessian of F as
Fxx Fxy 1
(10)
H(x, y) =
(k − 1)2 Fxy Fyy 8
MICHAEL A. BENNETT AND SANDER R. DAHMEN
and the Jacobian determinant of F and H by
1 Fx
(11)
G(x, y) =
k − 2 Hx
Fy .
Hy Then, analogous to (4), one has the classical syzygy
4H(x, y)3 + G(x, y)2 = dn F (x, y)n ,
(12)
∗
for certain dn ∈ Q . If F (x, y) ∈ Z[x, y], then the same is true of G(x, y) and
H(x, y), and dn is, in fact, a nonzero integer. Specifically, we can write
dn = −2in 3jn δn
(13)
where δn is an integer satisfying
∆F = cn δnk(k−1)/6 ,
(14)
and in , jn and cn are as follows:
n in
2 0
3 8
4 4
5 8
jn
3
0
3
3
cn
1
−33
−28
525
Equation (14) determines δn and hence dn uniquely, if n = 2 or n = 4, and up to
sign if n = 3 or n = 5. For the sake of completeness, we will provide an expression
for δn in terms of the coefficients of F (see e.g. Appendix A of Edwards [13]). Write
our Klein form F (x, y) ∈ Z[x, y] as
F (x, y)
=
ax3 + bx2 y + cxy 2 + dy 3 ,
F (x, y)
=
ax4 + bx3 y + 3cx2 y 2 + dxy 3 + ey 4 ,
F (x, y)
=
ax6 + bx5 y + 5cx4 y 2 + 10dx3 y 3 + 5ex2 y 4 + f xy 5 + gy 6 ,
and
F (x, y) = ax12 + bx11 y + 11cx10 y 2 + 55dx9 y 3 + 165ex8 y 4 + 66f x7 y 5 + 11gx6 y 6
+66hx5 y 7 + 165ix4 y 8 + 55jx3 y 9 + 11kx2 y 10 + lxy 11 + my 12 ,
for index 2, 3, 4 and 5 respectively, where a, b, c, . . . , m are integers (the local conditions for index 3, 4 and 5, such as the fact that α2 is divisible by 11, in case n = 5,
are immediate consequences of (7), (8) and (9)). Then we have
δ2
= b2 c2 − 4ac3 − 4b3 d + 18abcd − 27a2 d2 = ∆F ,
δ3
=
δ4
= −15d2 + 10ce − bf + 6ag.
8ace + bcd − ad2 − eb2 − 2c3 ,
There is no such polynomial expression for δ5 , but since we have (a, b) 6= (0, 0) for
a Klein form, δ5 can always be obtained from the identities
2ag − 7bf + 140ce − 105d2 ,
5aδ5
=
5bδ5
= −420de + 126cf − 5bg + 84ah.
For future use, we will have need of the following result.
Proposition 2.1. The resultant of a binary form F of degree k with its Hessian
H satisfies
(15)
Res(H(x, y), F (x, y)) = (−1)k ∆2F .
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
9
Proof. To prove this, we begin by considering the identity
(k − 1) (Fx Fy − x y H) = k Fxy F,
readily established by equating coefficients. It follows that
Res(F, Fx Fy ) = Res(F, x y H)
and hence
Res(F, Fx ) Res(F, Fy ) = Res(F, x) Res(F, y) Res(F, H).
Without loss of generality, we may assume that F (1, 0)F (0, 1) 6= 0. Since
Res(F, Fx ) = (−1)k(k−1)/2 F (1, 0)∆F and Res(F, Fy ) = (−1)k(k−1)/2 F (0, 1)∆F ,
while
Res(F, x) Res(F, y) = (−1)k F (1, 0) F (0, 1),
we obtain (15).
3. From elliptic curves to Klein forms (and back)
There is a strong connection between Klein forms and elliptic curves, whose
arithmetic consequences we wish to exploit. Let us consider an elliptic curve in
Weierstrass form
E : y 2 + a1 xy + a3 y = x3 + a2 x2 + a4 x + a6 ,
(16)
where the ai lie in Q, say. Define, as usual,
b2
b4
b6
b8
= a21 + 4a2 ,
= 2a4 + a1 a3 ,
= a23 + 4a6 ,
= a21 a6 + 4a2 a6 − a1 a3 a4 + a2 a23 − a24
and, for integer n ≥ 2, let
Y
ψnE (x) =
(x − xP )
{P,−P }⊂E[n],
ord(±P )=n
denote the monic (division) polynomial with roots the x-coordinates of the points
of E with exact order n. One readily checks that
4ψ2E (x)
=
4x3 + b2 x2 + 2b4 x + b6 ,
3ψ3E (x)
=
3x4 + b2 x3 + 3b4 x2 + 3b6 x + b8 ,
2ψ4E (x)
=
2x6 + b2 x5 + 5b4 x4 + 10b6 x3 + 10b8 x2 + (b2 b8 − b4 b6 )x + b4 b8 − b26 ,
and
5 ψ5E (x) = 32 ψ22 ψ4 − 27 ψ33 .
For technical reasons, we will have use of a slightly modified version of ψ5E (x):
10
MICHAEL A. BENNETT AND SANDER R. DAHMEN
ψ̃5E (x) = x12 + b2 x11 + 11b4 x10 + 55b6 x9 + 165b8 x8 − 66(b4 b6 − b2 b8 )x7
+11(−b2 b4 b6 − 15b26 + b22 b8 + 10b4 b8 )x6 − 66(b2 b26 − b2 b4 b8 + b6 b8 )x5
−165(b4 b26 − b2 b6 b8 + 5b28 )x4 − 55(5b36 − 6b4 b6 b8 + b2 b28 )x3
−11(b2 b36 − 2b2 b4 b6 b8 + 21b26 b8 + b22 b28 − 25b4 b28 )x2
+(−b22 b36 + 20b4 b36 + 2b22 b4 b6 b8 − 46b2 b26 b8 − b32 b28 + 30b2 b4 b28 + 95b6 b28 )x
−b2 b4 b36 + 25b46 + 2b22 b26 b8 − 56b4 b26 b8 − b22 b4 b28 + 27b2 b6 b28 − 125b38 .
The polynomials ψ5E and ψ̃5E are not unrelated; one can in fact show that they have
the same splitting field. We associate to the polynomials ψnE (x) (n = 2, 3 and 4)
and to ψ̃5E (x), binary forms, via homogenization:
E
ψn (x/y) y k if n = 2, 3 and 4, for k = 3, 4 and 6, respectively;
E
Kn (x, y) =
ψ̃5E (x/y) y 12 if n = 5.
For a singular curve E given by (16), we define forms KnE by the the same polynomial identities. If we denote by
∆E = −b22 b8 − 8b34 − 27b26 + 9b2 b4 b6
the discriminant of E and by ∆n the discriminant of the form (6 − n)KnE (for
n = 2, 3, 4, 5), then a direct computation yields
 4
2 ∆E
if n = 2,



−33 ∆2E if n = 3,
(17)
∆n =
−28 ∆5E if n = 4,


 25 22
if n = 5.
5 ∆E
Our motivation for introducing the forms KnE is provided by the following result.
Proposition 3.1. Let E be an elliptic curve over a number field L, and let n ∈
{2, 3, 4, 5}. Then KnE ∈ L[x, y] is a Klein form. Conversely, if F (x, y) ∈ L[x, y] is
a Klein form of index n with F (1, 0) 6= 0, then there exists an elliptic curve E/L
such that
(18)
F (x, y) = F (1, 0) KnE (x, y).
Proof. To show that KnE is a Klein form, one checks that the conditions for the
vanishing of the fourth transvectant τ4 (KnE ) are satisfied; this is a short calculation,
using the relation 4 b8 = b2 b6 − b24 . Furthermore, from (17) we immediately obtain
that KnE is nondegenerate, which concludes the proof that KnE is a Klein form.
Conversely, from the explicit equations for the vanishing of the τ4 (F ) (i.e. from
Pk
(7), (8) and (9)), we observe that the coefficients of F (x, y) = i=0 αi xk−i y i are
uniquely determined, via recursion, from the first four, α0 , α1 , α2 , α3 , provided
α0 6= 0. We define E as in (16), with a1 = a3 = 0 and
 α1 α2 α3
( 0 , α0 , α0 )
if n = 2,


 (α
3α1 α2
α3
,
,
)
if
n = 3,
4α0 2α0 4α0
(a2 , a4 , a6 ) =
α1
α2
α3
( 0 , 5α0 , 20α0 )
if n = 4,


 2α
α1
α2
α3
( 4α
,
,
)
if
n = 5,
0 22α0 220α0
whereby (18) holds. It remains to check that the discriminant of E is nonzero. This
follows directly from (17) and the fact that the discriminant of a Klein form is, by
definition, nonzero.
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
11
Under the multiplication by two map on E, the points of order 4 are, naturally
enough, mapped to points of order 2. This is expressed, at least at the level of
x-coordinates, by the identity
(19)
K2E (x4 − b4 x2 y 2 − 2b6 xy 3 − b8 y 4 , 4yK2E (x, y)) = K4E (x, y)2 .
As a consequence of this, if F is a Klein form of index 4, there necessarily exists a
Klein form G of index 2, and quartic forms p and q, all defined over the same field
as F , for which F (x, y)2 = G(p(x, y), q(x, y)). The forms G, p and q are readily
obtained from (19) (if F (1, 0) = 0, one needs to apply a suitable linear translation).
Note that if the sextic form F (x, y) ∈ Z[x, y], the same need not be true for the
corresponding cubic form G(x, y).
4. Frey-Hellegouarch curves for Klein forms
We construct Frey-Hellegouarch curves for Klein forms F (x, y) ∈ Z[x, y] by exploiting the fact that an elliptic curve in short Weierstrass form
Y 2 = X 3 + aX + b
has discriminant
−24 (4a3 + 27b2 ).
Multiplying identity (12) by 33 , we find that
4 (3 H(x, y))3 + 27 G(x, y)2 = 33 dn F (x, y)n ,
whereby the Frey-Hellegouarch curve
(20)
Ex,y : Y 2 = X 3 + 3 H(x, y)X + G(x, y)
has discriminant
∆(x, y)
=
−24 · 33 (4H(x, y)3 + G(x, y)2 )
=
−24 · 33 dn F (x, y)n .
From the formulae for dn (13), we may thus write
(21)
∆(x, y) = 24+in · 33+jn δn F (x, y)n .
Other fundamental quantities associated to Ex,y , which we record for later use, are
c4 (x, y)
=
−24 · 32 H(x, y),
c6 (x, y)
=
−25 · 33 G(x, y)
and
(22)
j(x, y) =
−28−in · 33−jn H(x, y)3
.
δn F (x, y)n
When we specialize (x, y) in (20) to integers (x0 , y0 ) with F (x0 , y0 ) 6= 0, we
obtain an elliptic curve Ex0 ,y0 over Q (given via a short Weierstrass model over Z).
For our purposes, it is important to understand arithmetic properties of Ex,y , for
arbitrary (x, y), not just those satisfying the generalized superelliptic equation (2).
From the equation for ∆(x, y), it is immediate that if F is a Klein form of
E
index n with F (1, 0) 6= 0, then E1,0 is an elliptic curve; the Klein form Kn 1,0 (x, y)
associated to it is simply the original Klein form F up to scaling and a linear
12
MICHAEL A. BENNETT AND SANDER R. DAHMEN
change of variables. Specifically, we have (as can be verified via any computer
algebra package)
(23)
1
F (x − α1 y, k α0 y) ,
α0
KnE1,0 (x, y) =
where k = 12/(6 − n).
In practice, we will typically consider quadratic twists of (20), chosen to minimize
its conductor, when (x, y) are specialized to a solution (x0 , y0 ) of (2). The quadratic
(t)
twist of (20) by t ∈ Q∗ is denoted by Ex,y , i.e.
(t)
Ex,y
: Y 2 = X 3 + 3 H(x, y)t2 X + G(x, y)t3 .
(t)
We will choose t such that Ex0 ,y0 has good reduction outside primes dividing
n ∆F F (x0 , y0 ); to demonstrate the existence of such a twist, we will derive explicit
(t)
(minimal) models for Ex,y in the next two sections and Appendix A.1 (though, for
simplicity, in case n = 5, our models are not necessarily minimal at 2).
4.1. Twisting and minimization at 3. For forms of index n ∈ {2, 4, 5}, we can
actually remove the factor 33+jn = 36 in the discriminant ∆(x, y) of the FreyHellegouarch curve (20) as follows. Let us define T (x, y) to be α1 x − α2 y, if n = 2,
α1 x4 − α2 x3 y + α4 xy 3 − α5 y 4 , if n = 4, and
α1 x10 − α2 x9 y + α4 x7 y 3 − α5 x6 y 4 + α7 x4 y 6 − α8 x3 y 7 + α10 xy 9 − α11 y 10 ,
√
if n = 5. Then translation by T (x, y), and twisting over Q( 3), transforms (20)
(3)
into a model for Ex,y given by
(24)
Y 2 = X3 + T X2 +
T2 + H
T 3 + 3T H + G
X+
.
3
27
It is relatively easy to show that (T 2 + H)/3, (T 3 + 3T H + G)/27 ∈ Z[x, y], in each
case. Fundamental quantities associated to (24) are
(25)
∆(x, y) = 24+in δn F (x, y)n ,
c4 (x, y) = −24 H(x, y),
c6 (x, y) = −25 G(x, y)
and
(26)
j(x, y) =
−28−in H(x, y)3
.
δn F (x, y)n
Note that if we specialize x and y to (coprime) integers here, the resulting model
over Z need not be minimal at 3.
4.2. Twisting and minimization at 2. For Klein forms with index n ∈ {3, 5}
and odd discriminant ∆F , we can remove the factor 24+in = 212 from ∆(x0 , y0 ),
again through choice of suitable twist.
Lemma 4.1. Suppose that F is a Klein form of degree 4 or 12, with ∆F odd. Then,
for any coprime x0 , y0 ∈ Z with F (x0 , y0 ) 6= 0 either the model (20) for Ex0 ,y0 or
its twist by −1 is not minimal at 2. In particular, for the corresponding t = ±1,
(t)
there exists a Weierstrass model over Z for Ex0 ,y0 with
c4 = −32 H(x0 , y0 ),
∆ = 33+jn δn F (x0 , y0 )n .
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
13
Proof. For a form F as in (6), with k = 4, we twist (20) by (−1)δ , where
1 if α1 ≡ 1 (mod 4) or α3 ≡ −1 (mod 4),
(27)
δ=
0 if α1 ≡ −1 (mod 4) or α3 ≡ 1 (mod 4).
Note that assuming ∆F (and hence δ3 ) to be odd is equivalent to
(28)
α1 α2 α3 + α0 α3 + α1 α4 ≡ 1 (mod 2),
which guarantees that at least one of α1 or α3 is also odd, while (7) ensures that α1
and α3 are distinct modulo 4. The interested reader can find further information
(including minimal models) for Klein forms of degree 4 in Appendix A.1. We
omit the (gory) details of the degree 12 case; they are essentially similar (if much
longer).
It is worthwhile to note that the minimization of the valuation of the conductor
at the prime 2 can be done independently from that at 3.
4.3. Frey-Hellegouarch curves: conclusions. Combining our models from the
preceding subsections, we arrive at a general result on primes dividing the minimal
discriminants of the Ex,y . Here and henceforth, let us denote by SF the set of
primes dividing n ∆F , together with the prime at ∞. Furthermore, we define, for
a nonzero integer m, νp (m) to be the largest power of p dividing m, and set, for
m, n ∈ Z \ {0}, νp (m/n) = νp (m) − νp (n).
Proposition 4.2. Let F (x, y) ∈ Z[x, y] be a Klein form of index n, with corresponding family of Frey-Hellegouarch curves Ex,y as given by (20) and let x0 , y0 be
coprime integers with F (x0 , y0 ) 6= 0. Then there exists t ∈ {±1, ±3} such that for
(t)
all primes p 6∈ SF we have that Ex0 ,y0 is semistable at p and
νp (∆min (Ex(t)
)) = n νp (F (x0 , y0 )).
0 ,y0
Proof. This follows from the preceding formulae for c4 and ∆, together with Proposition 2.1 and the well known fact that if a Weierstrass model over Z for an elliptic
curve E has p - c4 or p - ∆ for a prime p, then E is semistable at p and the model
is minimal at p.
5. Galois representations attached to E/Q
Let E/Q be an elliptic curve and n a positive integer. By ρE
n we denote the
mod n Galois representation GQ → GL2 (Z/nZ) induced by the natural action of
the absolute Galois group GQ = Gal(Q/Q) on the n-torsion points E[n] (and the
choice of a basis for E[n]). Recall the following standard result.
Proposition 5.1. Let E/Q be an elliptic curve with conductor N , n a positive
integer and p a prime. If p - nN , then ρE
n is unramified at p and
Trace(ρE
n (Frobp ))
Det(ρE
n (Frobp ))
≡ ap (E)
≡ p
(mod n),
(mod n).
In case n is prime, combining the theorem above with the Chebotarev density
theorem and the Brauer-Nesbitt theorem, we obtain the following well-known result.
Proposition 5.2. Let E and E 0 be elliptic curves over Q and n be prime. Then the
E0
semisimplifications of ρE
n and ρn are isomorphic if and only if, for all but finitely
many primes p, we have ap (E) ≡ ap (E 0 ) (mod n).
14
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Since we desire to apply this result with n ∈ {2, 3, 4, 5}, we require something
analogous to Proposition 5.2 for the case n = 4. More generally, for n = le a
prime power, we can replace the Brauer-Nesbitt theorem by Carayol’s partial generalization [5, Thm. 1] for representations over local rings (Z/le Z in our case), at
the cost of assuming absolute irreducibility for the corresponding residual mod l
representation.
Proposition 5.3. Let E and E 0 be elliptic curves over Q, and let n = le for some
E0
prime l and positive integer e. Suppose that ρE
are absolutely irreducible.
l and ρl
0
E
Then ρE
n and ρn are isomorphic if and only if, for all but finitely many primes p,
we have ap (E) ≡ ap (E 0 ) (mod n).
6. Moduli of elliptic curves: constant n-torsion
As it transpires and crucially for our arguments, the Frey-Hellegouarch curves
associated to the Klein forms of index n actually describe families of elliptic curves
with constant n-torsion. Before we make this more precise, we introduce, following
Fisher [19], some terminology.
Definition 6.1. Let K be a number field, E and E 0 be elliptic curves over K, and
n be a positive integer. We call E n-congruent to E 0 is there exists an isomorphism
ψ : E[n] → E 0 [n] of Gal(K/K)-modules that respects the Weil pairing, i.e. for all
P, Q ∈ E[n] we have
en (ψ(P ), ψ(Q)) = en (P, Q).
We call E reverse n-congruent to E 0 if there exists an isomorphism ψ : E[n] → E 0 [n]
of Gal(K/K)-modules that reverses the Weil pairing, i.e. for all P, Q ∈ E[n] we
have
en (ψ(P ), ψ(Q)) = en (P, Q)−1 .
For our purposes, it is quite straightforward to classify elliptic curves that are
n-congruent to our Frey-Hellegouarch curves.
Proposition 6.2. Suppose that F ∈ K[x, y] is a Klein form of index n ∈ {2, 3, 4, 5}
with F (1, 0) 6= 0 and corresponding family of Frey-Hellegouarch curves Ex,y . Let
E be an elliptic curve over K. Then E is n-congruent to E1,0 if and only if E is
isomorphic over K to Ex,y for some x, y ∈ K.
Proof. Up to some scaling and linear changes of variables, this is immediate from
Theorem 9.4 of Fisher [19] (see also Rubin and Silverberg [40], [41], and Silverberg
[49]).
In case n = 2, this characterization provides us with all we require, since, for
elliptic curves E and E 0 over K, we have E[2] ' E 0 [2] as Gal(K/K) modules
precisely when E and E 0 are 2-congruent. The analogous statement for n ≥ 3,
however, is no longer true in general. Indeed, suppose that E/Q is an elliptic curve
and n ≥ 3 an integer. Write ỸE (n) for the functor from Q-schemes to sets which
is determined by denoting by ỸE (n)(K) the set of isomorphisms classes (E 0 , ψ),
where E 0 is an elliptic curve over K and ψ : E 0 [n] → E[n] is an isomorphism of
Gal(K/K)-modules. There exists, then, an affine curve YE (n) over Q representing
the functor ỸE (n), with the property that YE (n) decomposes over Q into φ(n)
smooth absolutely irreducible affine curves YE a (n), one for each a ∈ (Z/nZ)∗ . Let
en denote the Weil pairing. Then the K-rational points on the (moduli) curve
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
15
YE a (n) correspond to pairs (E 0 , ψ) as above, but with the extra condition that
the isomorphism ψ : E 0 [n] → E[n] satisfies en (ψ(P ), ψ(Q)) = en (P, Q)a . The
multiplication by b ∈ (Z/nZ)∗ map on E[n] induces an isomorphism between YE a (n)
and YE ab2 (n), whereby in order to determine all elliptic curves E 0 such that E[n] '
E 0 [n] it suffices to restrict attention to the moduli curves YE a (n) with a ∈ (Z/nZ)∗
distinct modulo ((Z/nZ)∗ )2 . In particular, for n = 3 and 4 (that is, for Klein forms
of degrees 4 and 6), it suffices to consider both n-congruent and reverse n-congruent
curves to a given Frey-Hellegouarch curve. In the case of index n = 5, we must treat
not only 5-congruent curves, but also curves arising from YE a (5) for one nonsquare
value of a modulo 5, say a = 2.
6.1. Curves with isomorphic n-torsion. Suppose that F ∈ Q[x, y] is a Klein
form of index n with F (1, 0) 6= 0. To complete the classification of all elliptic curves
E/Q with n-torsion isomorphic to the Frey-Hellegouarch curve E1,0 corresponding
to F , we begin by defining a companion (Klein) form F̃ by

if n = 2, 4,
 F
H
if n = 3,
(29)
F̃ =

(A, B, 11C, 55D)12 if n = 5,
where, for n = 5, we write F = (a, b, 11c, 55d)12 and set
A
= −27(7b6 − 504ab4 c + 11072a2 b2 c2 − 64000a3 c3 + 1024a2 b3 d
−36864a3 bcd + 110592a4 d2 ),
1620(b2 − 24ac)2 (b3 − 36abc + 216a2 d),
B
=
C
= −162(b2 − 24ac)(3b6 − 216ab4 c + 4288a2 b2 c2 − 12800a3 c3
+896a2 b3 d − 32256a3 bcd + 96768a4 d2 ),
D
=
108(b3 − 36abc + 216a2 d)(b6 − 72ab4 c + 1472a2 b2 c2 − 5632a3 c3
+256a2 b3 d − 9216a3 bcd + 27648a4 d2 ).
Remark 6.3. A more canonical choice, in case n = 2, would be to define F̃ = G.
For our applications, however, this provides no advantages (and introduces some
minor complications).
To a companion form F̃ , we associate a family of elliptic curves Ẽx,y as follows.
If n = 2, we simply take Ẽx,y = Ex,y . For n = 3, we define Ẽx,y to be following
twist of the standard family of curves associated to F̃ :
Ẽx,y : Y 2 = X 3 + 24 · 3 δ33 F (x, y)X − 23 δ34 G(x, y).
In this case, the corresponding discriminant and j-invariant are given by
(30)
−212 33 δ3 F (x, y)3
˜
.
∆(x,
y) = 212 · 33 δ38 H(x, y)3 and j̃(x, y) =
H(x, y)3
Note that here we have
j(x, y)j̃(x, y) = 17282 .
(−δ )
For n = 4, we define Ẽx,y = Ex,y 4 . Our motivation for introducing Ẽx,y (in case
n = 3 or 4) is apparent in the following result.
Proposition 6.4. Let F ∈ Q[x, y] be a Klein form of index n ∈ {3, 4} with
F (1, 0) 6= 0 and corresponding family of curves Ex,y . If E/Q is an elliptic curve,
16
MICHAEL A. BENNETT AND SANDER R. DAHMEN
then E is reverse n-congruent to E1,0 if and only if E is isomorphic over Q to Ẽx,y
for some coprime x, y ∈ Z.
Proof. This can be found, modulo scaling and a linear change of variables, in [19,
p. 22] (after observing that −∆(x, y)/δ4 is a square).
Since −1 is a square modulo 5, the property of being reverse 5-congruent is equivalent to that of being 5-congruent. To complete our classification of curves with
isomorphic n-torsion to E1,0 , it remains, then, to find elliptic curves corresponding
to the rational points on YE a (5) for some nonsquare a, modulo 5, say a = 2. Let j5
and j50 denote the j-maps from XE (5) and XE 2 (5), respectively, to X(1). We wish
to relate j50 to j5 . For this purpose, we introduce the map J : X(1) → X(1) given
by
(31) J(j)
j(j 2 − 1456j − 3670016)3
(−7j + 4096)5
(−j + 1728)(j 3 − 1320j 2 + 9043968j + 1073741824)2
= −
+ 1728.
(−7j + 4096)5
=
In [7, p. 86], it is shown that, for a rational point P on XE (5), we have J(j5 (P )) =
j50 (P 0 ) for some rational point P 0 on XE 2 (5).
Now consider the Klein form F = (a, b, 11c, 55d)12 , where we assume that a 6= 0,
and as usual write j(x, y) for the j-invariant of the family Ex,y associated to F
0
the corresponding family of curves
(which we identify with j5 ). Denote by Ẽx,y
associated to the companion form F̃ , i.e.
0
Ẽx,y
: Y 2 = X 3 + 3 H(F̃ )(x, y)X + G(F̃ )(x, y),
and let j̃(x, y) denote its j-invariant. It is straightforward to check that J(j(1, 0))
is the image of a rational point under j̃(x, y) (and hence this j-map can be iden0
is the family of elliptic curves
tified with j50 ). It follows that some twist of Ẽx,y
2 (5). We denote
corresponding to the rational points on the modular curve XE1,0
this twist by Ẽx,y . For any given duodecic Klein form F ∈ Z[x, y], it is a simple
matter to explicitly write down the correct twist; in practice, we actually only need
its j-invariant, which is, of course, given by j̃(x, y) above. We summarize.
Proposition 6.5. Let F ∈ Q[x, y] be a Klein form of index n = 5 with F (1, 0) 6= 0
and corresponding family of curves Ex,y and let E/Q be an elliptic curve. Then E
is isomorphic over Q to Ẽx,y for some coprime x, y ∈ Z if and only if there exists
a Gal(Q/Q)-module isomorphism ψ : E[5] → E1,0 [5] for which, for all P, Q ∈ E[5],
we have en (ψ(P ), ψ(Q)) = en (P, Q)2 .
For n = 2, 3, 4 and 5 we can now describe all elliptic curves with isomorphic
n-torsion to a given one.
Theorem 6.6. Let F ∈ Q[x, y] be a Klein form of index n ∈ {2, 3, 4, 5} with
F (1, 0) 6= 0 and corresponding families of curves Ex,y and Ẽx,y . Let E/Q be an
elliptic curve. Then E[n] ' E1,0 [n] as Gal(Q/Q)-modules if and only if E is isomorphic over Q to Ex,y or Ẽx,y for some integers x, y (which can be taken to be
coprime if n 6= 2).
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
17
Next, we define S̃F = SF̃ = {p : p | n∆F̃ } ∪ {∞}. A direct computation gives

if n = 2, 4
 ∆F /cn
(26 ∆F /cn )2
if n = 3,
∆F̃ /cn =
 198 99
(2 · 3 · F (1, 0)110 ∆F /cn )2 if n = 5.
This yields
(32)

 SF
SF ∪ {2}
S̃F =

SF ∪ {p : p | 6 F (1, 0)}
if n = 2, 4
if n = 3,
if n = 5.
We conclude this subsection with a straightforward but useful result.
Lemma 6.7. Suppose that F ∈ Z[x, y] is a Klein form of index n with corresponding
families of curves Ex,y and Ẽx,y . Write j(x, y) for the j-invariant of Ex,y and
j̃(x, y) for the j-invariant of Ẽx,y . Let E/Q be an elliptic curve of conductor NE ∈
Z∗SF with j-invariant jE . If jE = j(x, y) for some coprime integers x, y, then
F (x, y) ∈ Z∗SF . If jE = j̃(x, y) for some coprime integers x, y, then F̃ (x, y) ∈ Z∗S̃ .
F
Proof. Let us suppose that jE = j(x, y) for coprime integers x, y, and assume that
F (x, y) 6∈ Z∗SF (the case where jE = j̃(x, y) is handled in a similar fashion). There
thus exists a prime p for which p | F (x, y) and p - n∆F . From the explicit equation
for j(x, y), together with Proposition 2.1, it follows that νp (j(x, y)) < 0 and hence
E has bad reduction at p. We conclude that NE 6∈ Z∗SF , a contradiction.
(t)
For future reference, as in the case of Ex,y , we define, for t ∈ Q∗ , Ẽx,y to be the
quadratic twist of Ẽx,y by t.
6.2. Irreducibility and ramification properties. The explicit formulae for our
families of elliptic curves Ex,y associated to a Klein form F of index n imply that
E
the corresponding Klein form Kn 1,0 is, up to a linear change of variables, a constant
multiple of F (see (23)). This provides useful information about the irreducibility
E
of ρn x,y .
Proposition 6.8. Let F ∈ Q[x, y] be a Klein form of index n ∈ {2, 3, 4, 5} with
F (1, 0) 6= 0. Consider the family of elliptic curves Ex,y . If F is irreducible, then for
E
any coprime integers x and y, the mod-n Galois representation ρn x,y is irreducible.
E
Moreover, if n ∈ {2, 3, 4}, then, for any such x and y, ρn x,y is irreducible precisely
when F has no linear factor over Q.
Proof. Because the mod-n Galois representations are constant on the family Ex,y ,
it suffices to check irreducibility for (x, y) = (1, 0). For n = 2, 3 and 4, the modular
E
E
curves X0 (n) and X1 (n) coincide, whereby ρn 1,0 is irreducible if and only if ψn 1,0
E1,0
has no linear factor over Q. By construction, this is equivalent to Kn
having
no linear factor over Q. Since this form is a constant multiple of F , up to linear
change of variables, the desired result follows for n = 2, 3 and 4. In case n = 5,
the irreducibility of F forces the field Q(E[5]x ) to be “large”, in the sense that
E
Gal(Q(E[5]/Q)) is not contained in a Borel subgroup, i.e. ρ5 x,y is irreducible. Remark 6.9. If a Klein form F of index n = 5 is reducible, it could still happen
E
that the corresponding mod 5 representation ρ5 x,y is irreducible. In any particular
18
MICHAEL A. BENNETT AND SANDER R. DAHMEN
case, this is easy to determine by checking whether or not E1,0 has a rational
5-isogeny.
E
Ẽ
Since ρn x,y ' ρn x,y , we immediately conclude that the above proposition also
holds with the families Ex,y replaced by Ẽx,y . Furthermore, irreducibility is not
affected by taking quadratic twists, so Ex,y can also be replaced by any quadratic
twist of Ex,y or Ẽx,y .
In light of Proposition 5.3, we would also like to have a criterion for the residual
E
mod 2 representation associated to ρ4 x,y to be absolutely irreducible.
Proposition 6.10. Let F ∈ Q[x, y] be a Klein form of index 4 with corresponding
family of Frey-Hellegouarch curves Ex,y . If F has no factor of degree ≤ 2 over Q
and δ4 is not a square in Q, then for any (nondegenerate) x, y, the mod-2 Galois
E
representation ρ2 x,y is absolutely irreducible.
Proof. If E/Q is an elliptic curve with a rational 2-torsion point, then ψ4E (x) necessarily has a factor of degree ≤ 2 over Q. Further, if ρE
2 is irreducible, but not
absolutely irreducible, then ∆E is a square in Q. Arguing as in the proof of Proposition 6.8, the result follows.
(t)
Although the conductor of Ex,y associated to a Klein form of index n may contain
(t)
many primes not dividing n ∆F , the ramification of Q(Ex,y [n]) can still be restricted
in a satisfactory manner, using Tate curves.
Proposition 6.11. Let F ∈ Z[x, y] be a Klein form of index n, x and y integers
(t)
with F (x, y) 6= 0, and let t ∈ Z \ {0} be such that Ex,y is semistable outside SF .
E (t)
Then the representation ρn x,y is unramified outside primes dividing n ∆F .
(t)
Proof. By Proposition 4.2 the minimal discriminant of Ex,y is an n-th power, up to
primes dividing n ∆F , and at primes not dividing n ∆F the elliptic curve has good
or multiplicative reduction. A simple application of the theory of Tate curves now
gives the proposition.
7. The Chebotarev density theorem
Our goal in this section is to effectively bound the smallest prime p with ap (E1 )
and ap (E2 ) distinct, given two elliptic curves E1 /Q and E2 /Q with E1 [n] 6' E2 [n].
To do this, we appeal to an effective version of the Chebotarev density theorem,
due to Lagarias, Montgomery and Odlyzko [27].
Theorem 7.1 (Lagarias, Montgomery, Odlyzko). Let K/Q be a nontrivial finite
Galois extension with Galois group G and denote by dK the absolute value of the
discriminant of K. Let C be a conjugacy class of G. Then there exists a (rational)
prime p such that Frobp = C and
log p ≤ c1 log dK ,
where c1 is an absolute effectively computable constant.
Proof. See [27, Theorem 1.1].
Note that the prime p referenced in Theorem 7.1 is necessarily unramified in K.
For our purposes, however, we require somewhat more, namely a prime p with Frobp
corresponding to a given conjugacy class, but also lying outside a specified finite
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
19
set of primes S, containing all the ramified primes. The need for this stems from
the fact that if K = Q(E[n]), where E/Q denotes an elliptic curve with conductor
N , then we wish to work with primes p - N n – but not all primes dividing N n are
necessarily ramified in Q(E[n]).
It is worthwhile observing at this juncture that, under the assumption of the
Generalized Riemann Hypothesis, the upper bound in Theorem 7.1 can be improved
to
p ≤ c2 log2 dK ,
where c2 is another absolute constant (see Serre [44]; one may take c2 = 70). In
fact, in [44], one finds an argument that enables one to obtain (again under the
assumption of the Generalized Riemann Hypothesis) a bound of the shape
2

X
log(q) ,
p ≤ c3 [K : Q]2 log([K : Q]) +
q∈S
for the smallest rational prime p with Frobp = C, satisfying the additional condition
that p 6∈ S, for S a finite set of primes containing those primes that ramify in K.
Here c3 is again an absolute constant (one can take c3 = 280). In a similar fashion,
we may derive the following variant of Theorem 7.1.
Theorem 7.2. Let K/Q be a finite Galois extension with Galois group G of degree
n > 1 and denote by dK the absolute value of the discriminant of K. Let S be a finite
set of primes, including those primes that ramify in K, and let C be a conjugacy
class of G. Then there exists a (rational) prime p 6∈ S such that Frobp = C and


X


,
log p ≤ c1 
n
log
2
+
n
log
q
+
2
log
d
K


q∈S
q-dK
where c1 is the same absolute effectively computable constant as in Theorem 7.1.
√
Proof. Define K 0 = K( ±D), where D is the product of the odd primes in S
that fail to ramify in K, and the sign is chosen so that dK 0 is minimal under the
restriction that 2 ramifies in K 0 if 2 ∈ S and fails to ramify in K. Arguing as in
[44, pp. 135–136], we deduce the inequality
Y
dK 0 ≤ 2n d2K
qn ,
q∈S
q-dK
whereby an application of Theorem 7.1, with K replaced by K 0 , together with
standard properties of Frobp , provides the required bound for p.
Corollary 7.3. Let K/Q be a finite Galois extension with Galois group G of degree
n > 1 and denote by dK the absolute value of the discriminant of K. Let S be a finite
set of primes, including those primes that ramify in K, and let C be a conjugacy
class of G. Then there exists a (rational) prime p 6∈ S such that Frobp = C and
X
log p ≤ κn
log q,
q∈S
where κn is an effectively computable constant depending only on n = [K : Q].
20
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Proof. As is well known, there exists an effectively computable constant Cn , only
depending on n, such that for every prime q | dK we have νq (dK ) ≤ Cn . (Denoting
by DK the different of the extension K/Q, this follows easily from the facts that
dK = |NormK/Q (DK )| and νq (DK ) ≤ e − 1 + νq (e) for every prime q of K with
ramification index e for the extension K/Q). The result is now immediate from
Theorem 7.2.
An application of this corollary leads to our desired result on non-isomorphic
mod-n Galois representations associated to elliptic curves.
Proposition 7.4. Let E1 /Q and E2 /Q be elliptic curves with conductors N1 and
N2 , respectively, where N1 | N2 . Let n = le for some prime l and positive integer e,
i
and write ρi = ρE
n for i = 1 and 2. Suppose that ρ2 is unramified outside primes
2
dividing nN1 , that ρ2 is irreducible and, if e > 1, that ρE
is absolutely irreducible.
l
If ρ1 6' ρ2 , then there exists a prime p with p - nN1 , for which both
Trace(ρ1 (Frobp )) 6≡ Trace(ρ2 (Frobp ))
(mod n)
and
(33)
log p ≤ κn
X
log q,
q|nN1
where κn is an effectively computable constant only depending on n. In particular,
for this prime p, we have p | N2 or ap (E1 ) 6= ap (E2 ).
Proof. Consider the (continuous) homomorphism
Gal(Q/Q) → GL(2, Z/nZ) × GL(2, Z/nZ)
given by
σ 7→ (ρ1 (σ), ρ2 (σ)),
and denote by H its image and by K the fixed field of its kernel. Then K/Q
is Galois, unramified outside S := {p : p | nN1 }, and we can and shall identify
its Galois group with H. Propositions 5.2 and 5.3 guarantee the existence of an
element (a, b) ∈ H with
(34)
Trace(a) 6≡ Trace(b)
(mod n).
By our version of the effective Chebotarev density theorem, i.e. Corollary 7.3, we
conclude that there exists a prime p 6∈ S such that p satisfies (33) and (a, b) is a
Frobenius element above p. Thus, from (34),
Trace(ρ1 (Frobp )) 6≡ Trace(ρ2 (Frobp ))
(mod n).
The last statement of the proposition follows immediately, since if p - N2 (we already
know p - nN1 ), then Trace(ρi (Frobp )) ≡ ap (Ei ) (mod n) for i = 1, 2.
8. The modular method
Having constructed Frey-Hellegouarch curves from our Klein forms, with corresponding mod l Galois representations having suitable ramification properties, it
is relatively straightforward to translate this information into a statement about
congruences between modular forms.
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
21
Proposition 8.1. Let F be a Klein form of index n and suppose that x0 , y0 and
z0 6= 0 are integers such that
(35)
F (x0 , y0 ) = u0 z0l ,
gcd(x0 , y0 ) = 1,
where u0 ∈ Z∗SF and l > 163 is prime.
Denote by N0 the conductor of the
(t)
corresponding Frey-Hellegouarch curve E0 = Ex0 ,y0 , with t chosen so that E0 is
0
semistable outside SF . Then ρE
is modular of level
l
Y
(36)
N1 =
pνp (N0 ) .
p|n∆F
In particular, there exists a newform f of level N1 , weight 2 and trivial character,
whose coefficients lie in a number field Kf , and a prime L ⊂ OKf lying above l,
such that for all primes p with p - lN0 ,
(37)
ap (f ) ≡ ap (E0 )
(mod L),
while for those primes p with p - lN1 and p | N0 , we have
(38)
ap (f ) ≡ ±(p + 1)
(mod L).
Proof. Via work of Breuil, Conrad, Diamond and Taylor [4] (building on that of
0
Wiles [58]), we have that ρE
is modular of level N0 . Since l > 163, we may
l
0
appeal to a result of Mazur [31] to conclude that ρE
is irreducible. Now by level
l
E0
lowering (Ribet [38], [39]), the representation ρl is modular of level N0 /N 0 where
N 0 is any product of primes p for which E0 has multiplicative reduction at p and
l | νp (∆min (E0 )). Proposition 4.2, together with (35), now immediately show that
we can take N0 /N 0 to be equal to N1 , as given by (36). One might note that
0
N1 is not necessarily the Serre level of the representation ρE
l ; for the applications
we have in mind, this is not especially important. Congruences (37) and (38) are
well-known and follow, essentially, from comparing traces of Frobenii.
To apply a result of the flavour of Proposition 8.1, one would like to extract as
much arithmetic information as possible from the congruences (37) and (38). In the
most optimistic of worlds, they can be used to deduce an outright contradiction,
at least for l suitably large. The simpler situation, though paradoxically the one
which leads to the worse bound for l, is when the newform f (whose existence
is guaranteed by Proposition 8.1) is non-rational, i.e when Kf 6= Q. For such a
newform f of level N , we Q
know by a result of Kraus [26, Lemme 1] that aq (f ) 6∈ Z
for some prime q ≤ (N/6) p|N (1 + 1/p). In the interests of keeping our exposition
reasonably self-contained, and since the result will lead to subsequently cleaner
statements for our upper bounds, we will sharpen this slightly.
Lemma 8.2. Let f be a newform of level N . If for some positive integer n, an (f ) 6∈
Z, then aq (f ) 6∈ Z for some prime q - N with q ≤ B(N ), where
(39)
Y νX
p (N )
N Y
1
B(N ) =
1+
−
φ p[a/2] + 1.
6
p
a=0
p|N
p|N
Here, [a/2] denotes the greatest integer ≤ a/2.
22
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Proof. From the classical theory of modular forms, we know (by appealing to
Riemann-Roch or simply by integrating over the boundary of a fundamental region) that for a nonzero modular form g of weight k with respect to a congruence
subgroup Γ, we have
X
k
[SL(2, Z) : ±Γ].
(40)
νP (g) =
12
P
Here, the sum is over a fundamental region (for the action of Γ) of the extended
upper half plane (one must be careful in defining the multiplicities νP as they can
be nonintegral at elliptic points and irregular cusps).
Now suppose that an (f ) 6∈ Z for some positive integer n. Then g = f − f is
nonzero for some conjugate f of f (which is also a newform of level N ). Since
Γ0 (N ) has no irregular cusps (or since the weight of g is even) and g is a cuspform,
we have νP (g) ≥ 1 at all cusps P . By applying (40), we obtain
νi∞ (g) ≤
1
[SL(2, Z) : Γ0 (N )] − #{cusps of Γ0 (N)} + 1.
6
Further, we have the well known formulae
[SL(2, Z) : Γ0 (N )] = N
Y
p|N
1
1+
p
and
#{cusps of Γ0 (N)} =
p (N )
Y νX
φ p[a/2] .
p|N a=0
It follows that an (g) =
6 0 (equivalently, an (f ) 6∈ Z) for some positive integer n
satisfying n ≤ B(N ). Since f is a newform, we may assume that n is prime and
also that n - N (since ap (f ) ∈ {±1, 0} for primes p | N ).
Combining the preceding two results, we have
Proposition 8.3. Let F be a Klein form of index n and suppose that x0 , y0 and
z0 6= 0 are integers satisfying (35) where u0 ∈ Z∗SF and l > 163 is prime. If the
newform f whose existence is guaranteed by Proposition 8.1 has the property that
[Kf : Q] > 1, then there exists an effectively computable absolute constant c such
that


Y
X
(41)
log l < c 
p2 
log p.
p∈SF
p∈SF
Proof. If q is prime, coprime to lN1 , then it follows from (37) and (38) that the
(rational) prime l divides either
NormK/Q (aq (f ) − aq (E0 ))
or NormK/Q (aq (f ) ∓ (q + 1)) ,
depending on whether q does or does not divide N0 , respectively. If these norms
are nonzero, the Hasse-Weil bounds thus imply that
√ 2 [K :Q]
(42)
l ≤ (1 + q) f .
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
23
Applying Lemma 8.2, we find that aq (f ) 6∈ Z (whereby the norms above are
nonzero) for some prime q not dividing N1 , with
Y νpX
(N1 ) N1 Y
1
q≤
1+
−
φ p[a/2] + 1.
6
p
a=0
p|N1
p|N1
8
5
2
Since N1 necessarily divides 2 · 3
p∈SF \{2,3} p , it follows, after a little work,
that
Y
p
√
p(p + 1).
1 + q ≤ 24 · 32
Q
p∈SF \{2,3}
If q = l, this implies (41), as desired. If q 6= l, then combining this inequality with
the fact that


Y
Y
1  8 5
[Kf : Q] ≤ dim(S2new (N1 )) ≤
2 ·3
p2  = 26 · 34
p2 ,
12
p∈SF \{2,3}
p∈SF \{2,3}
where the second inequality follows from Martin [30, Theorem 2], we may thus
conclude that




Y
Y
p
log l ≤ 27 · 34 
p2  log 24 · 32
p(p + 1) ,
p∈SF \{2,3}
p∈SF \{2,3}
and hence (41).
8.1. The main theorem. Collecting what we have proved so far, we obtain the
principal result of this paper.
Theorem 8.4. Let F be a Klein form of index n ∈ {2, 3, 4, 5} and SF denote the
set of primes dividing n ∆F , together with the archimedean prime. Suppose that F
is irreducible (in case n = 2 or 5), or that F contains no linear factors over Q[x, y]
(if n = 3), or no linear or quadratic factors over Q[x, y] (if n = 4). If n = 4,
further assume that δ4 , as given in Section 2, is not an integral square. To F ,
we associate a corresponding family of Frey-Hellegouarch curves Ex,y , as well as a
companion form F̃ , family of curves Ẽx,y , and set of primes S̃F . Let u0 ∈ Z∗SF and
suppose that (35) has a solution in integers x0 , y0 and (necessarily nonzero) z0 , and
(t)
prime l. Let N0 be the conductor of E0 = Ex0 ,y0 , where t is chosen such that E0
is semistable outside SF , and let N1 be given by (36). Then one of the following
occurs:
(i) there exists an effectively computable absolute constant c such that inequality
(41) holds, or
(ii) there exist integers x1 and y1 for which
(t)
Ex
,y
ρl
1
1
(t)
∗
0
' ρE
l , N (Ex1 ,y1 ) = N1 and F (x1 , y1 ) ∈ ZSF , or
(iii) there exist integers x1 and y1 for which
(t)
Ẽx
,y
ρl
1
1
(t)
∗
0
' ρE
l , N (Ẽx1 ,y1 ) = N1 and F̃ (x1 , y1 ) ∈ ZS̃F .
24
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Proof. Let x0 , y0 and z0 be integers and l a prime, satisfying equation (2). Without
loss of generality, we may suppose that l > 163. Appealing to Proposition 8.1,
we deduce the existence of a weight 2 newform f , of trivial character and level
0
N1 ∈ Z∗SF given by (36), satisfying ρE
' ρfl . In case f is not rational (i.e.
l
[Kf : Q] > 1), inequality (41) is immediate from Proposition 8.3. If, on the other
hand, f is rational, let us denote by E the elliptic curve corresponding to f via the
Eichler-Shimura relation.
If E0 [n] 6' E[n] (as Galois modules), we may apply Proposition 7.4 (where we
appeal, for the irreducibility conditions, to Propositions 6.8 and 6.10, and, for the
ramification condition, to Proposition 6.11). Together with the Hasse-Weil bounds,
this result immediately implies (a stronger version of) inequality (41).
If E0 [n] ' E[n], then, from Theorem 6.6, E is necessarily isomorphic over Q to
(t)
(t)
either Ex1 ,y1 or Ẽx1 ,y1 , for some integers x1 and y1 . The remaining parts of (ii)
and (iii) therefore follow from Lemma 6.7.
We note that sometimes additional methods may be applied to show that we
E (t)
Ẽ (t)
0
0
cannot have ρl x1 ,y1 ' ρE
or ρl x1 ,y1 ' ρE
and thereby treat the corresponding
l
l
superelliptic equations (for large enough prime exponents). An example where
we can carry this out through a simple comparison of images of inertia is given in
Remark 10.3. Other arguments involving elliptic curves with complex multipication
can be found in Section 13.
An immediate corollary of Theorem 8.4, from which Theorem 1.1 is a direct
consequence (in case n = 2), is the following
Corollary 8.5. Let n ∈ {2, 3, 4, 5} and suppose that F (x, y) ∈ Z[x, y] is an irreducible Klein form of index n with companion form F̃ . If n = 4, suppose also that
δ4 is not the square of an integer. If the Thue-Mahler equations
F (x, y) ∈ Z∗SF and F̃ (x, y) ∈ Z∗S̃F
each have no solutions (in coprime integers x, y), then the generalized superelliptic
equation
F (x, y) = z l , gcd(x, y) = 1
o
n
.
has at most finitely many solutions in integers x, y, z and integer l ≥ max 2, 30−7n
6−n
We should remind the reader that F = F̃ in case n = 2 or 4. The remainder of
this paper is devoted to exploring applications of Theorem 8.4 and Corollary 8.5.
In particular, we will attempt to demonstrate that their attendant hypotheses are
frequently, perhaps usually, met.
9. A cubic family
In Section 12, we will sketch a heuristic indicating that Theorem 1.1 is applicable
to “almost all” cubic forms. Our goal in this section is to provide an explicit
infinite family of cubic forms F which satisfy the hypotheses of Theorem 1.1. To
achieve this, we will exhibit a family of forms for which the corresponding ThueMahler equations (3) possess no solutions whatsoever. To our knowledge, no infinite
family of Thue-Mahler equations with corresponding set of (noninert) primes S
of unbounded cardinality, has hitherto been completely solved (though treating
families of Thue equations or inequalities has become relatively commonplace; see
e.g. [28] and [55]).
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
25
Let us define
(43)
Fa,b (x, y) = bx3 − ax2 y − (a + 3b) xy 2 − by 3 ,
where a and b are coprime nonzero integers, so that
(44)
∆Fa,b = (a2 + 3ab + 9b2 )2 .
It follows that a cubic field corresponding to a root of Fa,b (x, 1) = 0 is Galois and
cyclic; indeed all cyclic cubic fields arise in this fashion (see e.g. [57]). The existence
of a nontrivial automorphism for these forms will prove helpful in the sequel; to be
specific, we have
(45)
Fa,b (x, y) = Fa,b (−x − y, x) = Fa,b (y, −x − y).
An advantage of working with forms of this shape is that the method of ThueSiegel may be used to deduce good “irrationality measures” for the roots of the
equation Fa,b (x, 1) = 0. Such an approach to solving corresponding Thue inequalities has been carried out in [28], [55] and [56]. In case b = 1, the forms
Fa,1 (x, y) have been termed the simplest cubic forms (with corresponding simplest
cubic fields). For our purposes, since we wish to find forms F for which equation
(3) is insoluble, it is clear that we cannot take b = 2k for any non-negative integer
k (or else F (1, 0) ∈ Z∗SF ). Choosing b = 3 (we can obtain a similar result for any
fixed b 6= 2k ), we prove the following
Theorem 9.1. Let a be an integer such that a2 + 9a + 81 is squarefree and S the
set of primes dividing 2 (a2 + 9a + 81), together with an Archimedean prime. If
neither a nor −a − 9 is in the set {−4, 8, 22, 31}, then the Thue-Mahler equation
(46)
3x3 − ax2 y − (a + 9)xy 2 − 3y 3 ∈ Z∗S
has no solutions in integers. In these exceptional cases, the solutions to (46) are as
follows
±(x, y)
Fa,3 (x, y)
a
−40
(−25, 2), (2, 23), (23, −25)
±1
−31
(−19, 2), (2, 17), (17, −19)
±109
−17
(−5, 1), (1, 4), (4, −5)
±7
(−2, 1), (1, −2), (1, 1)
±1
−5
−4
(−1, −1), (−1, 2), (2, −1)
±1
8
(−4, −1), (−1, 5), (5, −4)
±7
22 (−17, −2), (−2, 19), (19, −17)
±109
31 (−23, −2), (−2, 25), (25, −23)
±1
Applying Theorem 1.1 (since, as we shall see, Fa,3 (x, y) is irreducible in Q[x, y])
immediately yields Theorem 1.2
It is worth noting that the arguments employed in [28], [55] and [56] do not lead
to Theorem 9.1 directly. Indeed, in the course of proving this result, one encounters
Thue inequalities of the shape
(47)
|Fa,b (x, y)| ≤ k,
with k larger than can be handled by direct application of the techniques of, say,
1/2
1/4
[56] (i.e of size ∆Fa,b rather than ∆Fa,b ).
26
MICHAEL A. BENNETT AND SANDER R. DAHMEN
9.1. From Thue-Mahler equations to Thue equations. One property of the
forms Fa,b (x, y), though a simple observation, is key to the proof of Theorem 9.1.
From (44), ∆Fa,b = δ 2 , where δ = a2 + 3ab + 9b2 . We have
Lemma 9.2. Let a and b be coprime integers and p prime with p - 3b. If 3 does
not divide νp (δ), then, for every x, y ∈ Z with Fa,b (x, y) ≡ 0 (mod pνp (δ)+1 ), we
have (x, y) ≡ (0, 0) (mod p).
Proof. Using the fact that p - b and the homogeneity of Fa,b , it obviously suffices
to prove that f (x) = Fa,b (x, 1) ≡ 0 (mod pνp (δ)+1 ) has no solution with x ∈ Z. To
see this, define h(x) = f 00 (x)/2 = −a + 3bx, so that we have the relation
(48)
27b2 f (x) + (2a + 3b + 3h(x))δ = h(x)3 .
Together with the fact that Res(2a + 3b, δ) = 27, if there exists an integer x with
pνp (δ)+1 | f (x), then necessarily pνp (δ)+1 | δ. This proves the lemma.
Remark 9.3. If 3 | νp (δ), then it can easily happen that f (x) has roots in Zp .
Take e.g. b = 3 and a = 848. Then δ = 73 · 13 · 163 and f (x) had three roots in Z7
(in the cubic number field defined by f (x), the prime 7 splits completely).
Remark 9.4. If νp (δ) = 1, then f (x) is a translate of an Eisenstein polynomial at
p (and hence irreducible over Q). Indeed, let t ∈ Z satisfy f (t) ≡ f 0 (t) ≡ 0 (mod p)
and write
f (y + t) = by 3 + h(t)y 2 + f 0 (t)y + f (t).
By (48), we have p | h(t) whereby, from Lemma 9.2, p2 - f (t) and hence f (y + t) is
Eisenstein in y at p.
We will suppose, here and henceforth, that b = 3 and, since
Fa,3 (x, y) = F−a−9,3 (−y, −x)
that, without loss of generality, a ≥ −4. We will also assume a2 + 9a + 81 to be
squarefree. Lemma 9.2 thus enables us to conclude, for coprime x, y, and p a divisor
of a2 + 9a + 81, that we have
Fa,3 (x, y) 6≡ 0 (mod p2 ).
Since, further, Fa,3 (x, y) ≡ 1 (mod 2), it follows that Fa,3 (x, y) ∈ Z∗S is equivalent
to Fa,3 (x, y) dividing a2 + 9a + 81. In other words, our family of Thue-Mahler
equations has been reduced to a number of families of Thue equations. Theorem
9.1 will now follow from resolving the equations
(49)
3x3 − ax2 y − (a + 9)xy 2 − 3y 3 = k
where k | a2 + 9a + 81.
Before we proceed, it is worth noting that the restriction to values of a for which
a2 + 9a + 81 is squarefree is not too severe. Indeed, as a simple application of
2
sieving, together with the observation
that
a + 9a + 81 ≡ 0 (mod p) is solvable
precisely when the Legendre symbol
−3
p
= 1, we have
2
1 lim
1 ≤ a ≤ n : a2 + 9a + 81 is squarefree =
n→∞ n
3
2
Y
p≡1 (mod 3)
2
1− 2 ,
p
where the product is over primes p. In particular, a + 9a + 81 is squarefree for
rather more than 90% of a ≡ ±1 (mod 3).
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
27
9.2. Solutions as convergents. In this subsection, we will show that solutions to
(49) correspond to convergents in the infinite simple continued fraction expansions
to the roots of Fa,3 (x, 1) = 0. To simplify our subsequent arguments somewhat, we
first apply standard computer algebra packages (either Pari/GP or Magma have
routines for solving Thue equations) to treat small values of a:
Proposition 9.5. If −4 ≤ a ≤ 5000 and there exist integers x, y and k satisfying
(49) with k ≥ 1 and k | a2 + 9a + 81, then we have a, x, y and k as follows
a
−4
8
22
31
56
k
1
7
109
1
3721
(x, y)
(−1, −1), (−1, 2), (2, −1)
(−4, −1), (−1, 5), (5, −4)
(−17, −2), (−2, 19), (19, −17)
(−23, −2), (−2, 25), (25, −23)
(−13, −1), (−1, 14), (14, −13)
Remark 9.6. The computation indicated here takes a surprisingly long time in
Magma. A (much) faster way to carry out such a calculation, at least for a of
moderate size, would be to appeal to Corollary 9.10 of subsection 9.3.
We will assume, from now on, that a > 5000 and that x and y are coprime
nonzero integers with
(50)
y > 0, −y/2 < x < y and xy(x + y) 6≡ 0 (mod 3).
Equivalently, (x, y) are of the form
(x, y) = (y − 3j, y), j = 1, . . . ,
y−1
, y ≡ ±1 (mod 3), gcd(y, j) = 1.
2
To see that this is without loss of generality, note that, for a cubic form F , we have
F (−x, −y) = −F (x, y), whereby we may assume y > 0. Appealing to (45) and
noting that Fa,3 (1, 1) = −2a − 9 fails to divide a2 + 9a + 81 for a coprime to 3,
leads to the desired inequalities (this essentially follows along the lines of Lemma
10 (b) of [28]). Since we suppose that x, y and a satisfy (49) where k | a2 + 9a + 81,
necessarily xy(x + y) is coprime to 3 and thus
x x x |k|
(51)
θ1 − y θ2 − y θ3 − y = 3y 3 ,
where θ1 < θ2 < θ3 are roots of Fa,3 (x, 1) = 0. Via Newton’s method, we have, for
a ≥ 11,
−1 − 3/a < θ1 < −1,
−3/a < θ2 < −3/a + 18/a2 ,
and
a/3 + 1 < θ3 < a/3 + 1 + 4/a.
We will actually study θ2 much more carefully later. For our purposes, however,
since (50) implies that −1/2 < x/y < 1, the above inequalities are enough to yield
x 2 |k|
(52)
θ2 − y < a y 3 .
28
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Our goal at this stage is to show that x/y is a convergent in the continued fraction
expansion to θ2 . From classical theory, this is certainly the case if we have
θ2 − x < 1 .
y 2y 2
If we have, say, |k| ≤ 10a, then it follows that x/y is a convergent to θ2 if y ≥
40. For the values of y < 40, there are precisely 171 corresponding pairs (x, y)
satisfying (50). For each such pair, Fa,3 (x, y) is a linear polynomial in a with
integer coefficients; computing its resultant with a2 + 9a + 81, we find that
Resa (Fa,3 (x, y), a2 + 9a + 81) = 35 R3 ,
for some positive integer R = R(x, y). We thus have that Fa,3 (x, y) divides R (again
using that a2 + 9a + 81 is squarefree). For the cases of x, y under consideration,
since we assume a ≥ −4, this can occur only for
Fa,3 (x, y)
(x, y)
(−1, 5)
20a − 153
(−2, 19) 646a − 14103
(−2, 25) 1150a − 35649
R
7
109
193
where Fa,3 (x, y) divides R for a = 8, 22 and 31, respectively.
Let us now suppose that |k| > 10a, with a > 5000 and y ≥ 40. To handle these
larger values of k requires a new approach. We appeal to (a special case of) work
of Hoshi and Miyake [24] (see also [35]):
Theorem 9.7. (Theorem 5.4 of [24]) If m and n are rational numbers, then the
splitting fields over Q of the polynomials x3 − mx2 − (m + 3)x − 1 and x3 − nx2 −
(n + 3)x − 1 coincide precisely when there exists z ∈ Q such that either
n=
m(z 3 − 3z − 1) − 9z(z + 1)
m(z 3 + 3z 2 − 1) + 3(z 3 − 3z − 1)
or n = −
.
3
2
mz(z + 1) + z + 3z − 1
mz(z + 1) + z 3 + 3z 2 − 1
Writing m = a/3 and n = a1 /3 for integers a and a1 , and putting z = y/x for x
and y coprime integers, we have
Corollary 9.8. If a and a1 are integers, then the splitting fields over Q of the
polynomials Fa,3 (x, 1) and Fa1 ,3 (x, 1) coincide precisely when there exist coprime
integers x and y such that
a1 = a +
(a2 + 9a + 81)xy(x + y)
(a2 + 9a + 81)xy(x + y)
or − a1 − 9 = a +
.
Fa,3 (x, y)
Fa,3 (x, y)
Since, for a putative solution to equation (46), Fa,3 (x, y) divides a2 + 9a + 81, it
follows that
(a2 + 9a + 81) x y (x + y)
a1 = a +
Fa,3 (x, y)
2
is an integer. From the fact that a + 9a + 81 is assumed to be squarefree, we have
that the discriminant of the associated cubic field is also (a2 + 9a + 81)2 . This
is a consequence of the fact that every prime dividing a squarefree a2 + 9a + 81
necessarily ramifies in this field. We may thus conclude that
a2 + 9a + 81 | a21 + 9a1 + 81,
whereby
a2 + 9a + 81 | (a − a1 )(a + a1 + 9).
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
29
Writing a2 + 9a + 81 = Fa,3 (x, y) · M and noting that gcd(Fa,3 (x, y), xy(x + y))
divides 3 (and hence is equal to 1), we thus have
(53)
2a + 9 + M xy(x + y) ≡ 0 (mod Fa,3 (x, y)).
Since xy(x + y) is even, the left hand side of (53) is necessarily nonzero, and so
(54)
|2a + 9 + M xy(x + y)| ≥ |Fa,3 (x, y)| .
We will use this inequality to show that, in this case as well, x/y is a convergent to
θ2 .
To begin, notice that if x > 0, then
|Fa,3 (x, y)| = ax2 y + (a + 9)xy 2 + 3(y 3 − x3 ) > axy(x + y),
while
a2 + 9a + 81
xy(x + y),
10a
contradicting (54) and the assumption a > 5000. We may thus assume the inequality −y/2 < x < 0, so that
|M xy(x + y)| <
(55)
|Fa,3 (x, y)| ≥ a|x|y 2 − ax2 y − 3(y 3 − x3 ) > a|x|y 2 − ax2 y − 27y 3 /8.
We will treat small and large values of |x| separately. If |x| ≤ 54, then either
y ≤ 20|x| (which leads to a finite set of pairs (x, y) which we treat as previously;
we find no new solutions to (49)) or we have y > 20|x| and hence, with the first
inequality in (55),
19
|Fa,3 (x, y)| >
a|x|y 2 − 3(y 3 − x3 ).
20
If we further suppose that y < a|x|/4, then −1/2 < x/y < −4/a and so, via our
inequalities upon the θi , Fa,3 (x, y) is positive. We have
1
3
Fa,3 (x, y) > a|x|y 2 − 3|x|3 >
ay 2 ,
5
16
where we have used that y ≥ 40, −54 < x < 0 and a > 5000. It follows from (54)
that
3
max {2a + 9, |M ||x|y(x + y)} ≥
ay 2
16
and hence
3
ay 2 ≤ |M ||x|y(x + y) ≤ 54 |M | y 2 ,
16
i.e. |M | ≥ a/288. Recalling that Fa,3 (x, y)M divides a2 + 9a + 81, we thus have
a
16(a2 + 9a + 81)
≤ |M | <
,
288
3a y 2
contradicting y ≥ 40 and a > 5000.
For the values of x with −54 < x < 0, we may thus suppose that y ≥ a|x|/4. For
x < 0 fixed, it is a routine exercise in calculus to show that Fa,3 (x, y) is monotone
decreasing as a function of y in the interval [a|x|/4, ∞). Writing x = −x0 where
x0 ∈ {1, 2, 4, . . . , 53} and setting a = 3a0 + i for i ∈ {1, 2},
Fa,3 (−x0 , a0 x0 + s) = c2 a20 + c1 a0 + c0 ,
where
c2 = (6 + i)x30 − 3x20 s, c1 = −ix30 + (15s + 2is)x20 − 6s2 x0
and
c0 = −3x30 − isx20 + (is2 + 9s2 )x0 − 3s3 .
30
MICHAEL A. BENNETT AND SANDER R. DAHMEN
If x0 ≥ 4, it is easy to check that
Fa,3 (−x0 , a0 x0 + 2x0 + [ix0 /3]) > a2 + 9a + 81
and
Fa,3 (−x0 , a0 x0 + 2x0 + [ix0 /3] + 1) < −(a2 + 9a + 81),
and hence we may assume that x0 ∈ {1, 2}. In case x0 = 2, we have that
Fa,3 (−2, 2a0 + i + 2) = (24 − 4i)a20 + (−4i2 + 20i + 72)a0 + (−i3 + 4i2 + 36i + 24),
and
Fa,3 (−2, 2a0 + i + 5) = (−4i − 12)a20 + (−4i2 − 28i)a0 + (−i3 − 11i2 − 15i + 51),
and so
Fa,3 (−2, 2a0 + i + 2) > a2 + 9a + 81
and
Fa,3 (−2, 2a0 + i + 5) < −(a2 + 9a + 81).
We have
Fa,3 (−2, 2a0 + i + 3) = (12 − 4i)a20 + (−4i2 + 4i + 72)a0 + (−i3 − i2 + 33i + 57)
and
Fa,3 (−2, 2a0 + i + 4) = −4ia20 + (−4i2 − 12i + 48)a0 + (−i3 − 6i2 + 16i + 72),
where, since x and y are coprime, we necessarily have i = 2 and i = 1, respectively.
Computing
Resa0 (4a20 + 64a0 + 111, 9a20 + 39a0 + 103) = 1093
and
Resa0 (4a20 − 32a0 − 81, 9a20 + 33a0 + 91) = 1093 ,
it follows that 4a20 + 64a0 + 111 or 4a20 − 32a0 − 81 divides 109, a contradiction.
In case x0 = 1, arguing as before we find that
|Fa,3 (−1, a0 + j)| > a2 + 9a + 81,
provided j ≤ −1 or j ≥ 6. We compute, for each j ∈ {0, 1, 2, 3, 4, 5},
Resa0 ((6 + i − 3j)a20 − (j 2 − (15 + 2i)j + i)a0 + (9 + i)j 2 − ij − 3, (3a0 + i)2 ).
In each case, we find this to be of the form T 3 where T ≤ 91, again contradicting
a = 3a0 + i > 5000.
We may thus assume |x| ≥ 54. It follows from (53) that
|M xy(x + y)| ≥ |Fa,3 (x, y)| − 2a − 9 > a|x|y 2 − ax2 y − 27y 3 /8 − 2a − 9
and so
|M xy(x + y)| ≥ a|x|y 2 /2 − 27y 3 /8 − 2a − 9.
If y < 2|x|a/27 then a|x|y 2 /2 − 27y 3 /8 > a|x|y 2 /4 and so
|M xy(x + y)| ≥ a|x|y 2 /4 − 2a − 9 > a|x|y 2 /5,
since y > 40. On the other hand, |M | < a/10 + 9 + 81/a, whereby
|M xy(x + y)| < a|x|y 2 /9.
We thus have y ≥ 2|x|a/27 ≥ 4a and so x/y is a convergent to θ2 .
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
31
9.3. Applying the method of Thue-Siegel. Our work in the preceding subsection led to the conclusion that x/y is a convergent in the continued fraction
expansion to θ2 . To reduce this to a finite problem, we appeal to an irrationality
measure for θ2 , derived from the method of Thue-Siegel:
Theorem 9.9. (Theorem 2.9 of [56]) Let a and b be positive integers with a ≥ 31b4
and suppose that Fa,b (θ, 1) = 0 with −1 < θ < 0. Then if p and q are integers with
q≥
a+ 23 b
9.04 ,
we have
θ −
where
λ=
and
p 1
>
,
q
c(a, b)q 1+λ
√
log a2 + 3ab + 9b2 + 0.83
<2
log a + 23 b − 2 log b − 1.3
λ
p
2.47
c(a, b) = 17.43 a2 + 3ab + 9b2
.
b2
With care, we can improve this result slightly, but it is adequate for our purposes.
It implies the following.
Corollary 9.10. (Theorem 2.10 of [56]) If a and b are positive integers with a ≥
31 b4 , k is a positive integer, and ifx and y are nonzero
integers satisfying (47),
q
a+ 23 b 3
ak
with −1/2 < x/y ≤ 1 and y ≥ max 9.04 , 1.99b2 , then
y 2−λ < 18.34
2.47
b2
λ
k,
where λ is as in Theorem 9.9.
Applying Corollary 9.10 with k ≤ a2 + 9a + 81, together with the fact that λ
in Theorem 9.9 is decreasing monotonically for suitably large values of a, we may
readily compute that
y < a15
if a > 5000,
y < a8.6 if a > 104 ,
y < a4.6 if a > 105 ,
y < a3.6 if a > 106 .
9.4. Continued fraction expansions to θ2 . From our preceding arguments,
x/y = pn /qn for some convergent in the simple continued fraction expansion to
θ2 , where qn is bounded above by the upper bound for y at the end of the previous
subsection. For 5000 < a ≤ 106 , we compute the continued fraction expansion for
θ2 using Pari/GP and verify that Fa,3 (pn , qn ) fails to divide a2 + 9a + 81 in each
case. To treat the remaining values of a > 106 (and hence qn < a3.6 ), we begin by
noting that we can explicitly compute the first few terms, in the simple continued
fraction expansion
θ2 = [a0 ; a1 , a2 , . . .] .
Specifically, applying Newton’s method, we find that, at least for a ≥ 86,

, 2, 1, a−34
if a ≡ 1 (mod 3),
 −1, 1, a+2
3
54
(a0 , a1 , a2 , a3 , a4 , a5 ) =

, 1, 2, a−14
if a ≡ 2 (mod 3).
−1, 1, a+2
3
54
32
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Here [x] denotes the greatest integer ≤ x. It follows that the first few convergents
pn /qn to θ2 are given by
1 0
1
2
3
, −
a+2 , −
a+2 − , ,−
1 1 1 + a+2
3
+
2
4
+
3
3
3
3
and
1 0
1
3
1
, −
a+2 , −
a+2 ,
− , ,−
1 1 1 + a+2
2
+
5
+
3
3
3
3
for a ≡ 1 (mod 3) and a ≡ 2 (mod 3), respectively.
If we suppose that x/y is a convergent to θ2 for which Fa,3 (x, y) divides a2 +
9a + 81, then, since this latter quantity is squarefree and coprime to 3, we have
that both (x, y) = ±(pn , qn ) for some n, whereby Fa,3 (pn , qn ) divides a2 + 9a + 81,
and that
(56)
pn qn (pn + qn ) 6≡ 0 (mod 3).
We note that condition (56) is not satisfied for n ∈ {0, 1, 4}. Writing a = 3j + s for
s ∈ {1, 2}, we have that
F3j+s,3 (p2 , q2 ) = F3j+s,3 (−1, j + 2) = sj 2 + (3s + 6)j + 2s + 9
and hence Fa,3 (p2 , q2 ) | a2 + 9a + 81 precisely when
sj 2 + (3s + 6)j + 2s + 9 divides 9j 2 + (6s + 27)j + s2 + 9s + 81.
A short calculation shows that this does not occur. Specifically, we have that
(13 − 4s) F3j+1,3 (p2 , q2 ) > a2 + 9a + 81
and hence necessarily a2 + 9a + 81 = M F3j+1,3 (p2 , q2 ) for some integer M with
1 ≤ M ≤ 12 − 4s, whereby
(9 − M s) j 2 + (6s + 27 − (3s + 6)M ) j + s2 + 9s + 81 − (2s + 9) M = 0.
This equation has no rational roots for the given choices of M and s. Since
F3j+1,3 (p3 , q3 ) = F3j+1,3 (−2, 2j + 5) = −4j 2 + 32j + 81
and
F3j+2,3 (p3 , q3 ) = F3j+1,3 (−1, j + 3) = −j 2 + j + 9,
we can argue similarly to conclude that x/y = pn /qn with n ≥ 5. To show that, in
fact, n ≥ 6, note that

−3[ a−34
54 ]−2


if a ≡ 1 (mod 3),
a−34
a+2
a−34

3[ 54 ][ 3 ]+2[ a+2

3 ]+4[ 54 ]+3
p5
=

q5

−3[ a−14

54 ]−1

if a ≡ 2 (mod 3).
a+2
a+2
a−14
3[ a−14
+
][
]
[
54
3
3 ]+5[ 54 ]+2
Writing a = 54k + t for k ∈ Z, 1 ≤ t ≤ 53 and gcd(t, 3) = 1, and arguing as before,
we find that Fa,3 (p5 , q5 ) fails to divide a2 + 9a + 81, in every case. It follows that
y > q5 and hence, since we assume a > 106 , y > a2 /55.
To finish our proof, we need to handle values of y satisfying
1 2
(57)
a < y < a3.6 for a > 106 .
55
If we continue further with our examination of the infinite simple continued fraction
for θ2 , perhaps unsurprisingly, complications arise. Indeed, the next few partial
quotients, for suitably large a, depend on the value of a modulo 54. Specifically,
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
33
we have that if a ≡ t (mod 54) with 1 ≤ t ≤ 53 and gcd(t, 3) = 1, then the
sequence ah6 , a7 , i. . . begins with a sequence αt of terms a6 , a7 , . . . , ak(t) , followed by
ak(t)+1 =
a−βt
3402
t
1
2
4
5
7
8
10
11
13
14
16
17
19
20
22
23
25
26
, where
αt
2, 3, 2, 1, 3, 1
1, 3, 1, 2, 3, 2
2, 26, 1, 1
1, 5, 2, 1, 4, 1
1, 1, 4, 1, 8, 1
1, 8, 1, 4, 1, 1
1, 1, 1, 1, 20, 1
1, 20, 1, 1, 1, 1
1, 1, 1, 11, 2, 1
107, 1
1, 2, 2, 15
15, 2, 2, 1
1, 3, 3, 8
8, 3, 3, 1
1, 4, 1, 2, 5, 1
5, 1, 2, 6
1, 7, 3, 4
4, 3, 7, 1
βt
2701
1514
1732
2813
3085
1898
3250
2063
2335
3416
232
2447
451
2612
2884
563
835
2996
t
28
29
31
32
34
35
37
38
40
41
43
44
46
47
49
50
52
53
αt
1, 14, 2, 3
3, 2, 14, 1
1, 107
2, 1, 11, 3
21, 1, 1, 2
2, 1, 1, 21
9, 1, 4, 2
2, 4, 1, 9
6, 2, 1, 5
1, 1, 26, 2
4, 1, 2, 3, 1, 1
1, 1, 3, 2, 1, 4
3, 1, 2, 1, 1, 1, 1, 1
1, 1, 1, 1, 1, 2, 1, 3
3, 11, 1, 2
1, 2, 11, 1, 1, 1
2, 1, 1, 1, 2, 1, 2, 1
1, 2, 1, 2, 1, 1, 1, 2
βt
1000
3215
31
1112
1384
197
1549
362
634
1715
1933
746
2152
911
1183
2264
2536
1295
These are valid for a ≥ 6818. It is worth noting here that if t ≡ 1, 7 (mod 9) then
αt is just the reverse of αt+1 , while if t ≡ ±4 (mod 9), the same is true of αt and
αt∓17 .
Even this knowledge leads us only, after much work, to the conclusion that
y a3 . Indeed, we find ourselves confronted with rather dramatic combinatorial
explosion. One way to overcome these technical difficulties is to appeal to an
argument of Wakabayashi, introduced in [54]. Instead of relying upon a continued
fraction expansion to θ2 with integer partial quotients, following Wakabayashi (see
section 8 of [56] for details), we compute one with rational partial quotients. Let
us suppose that, for a given real number ψ, we choose a rational numbers k0 such
that k0 < ψ < k0 + 1, and define recursively for i ≥ 1, ψi = 1/(ψi−1 − ki−1 ), where
ψ0 = ψ and, in each case, rational ki are chosen with ki < ψi < ki + 1. We call
ψ = [k0 ; k1 , k2 , . . .]
a continued fraction expansion with rational partial quotients for ψ. If we further
define convergents pn /qn via
p0 = 1, p1 = k0 , pn+1 = kn pn + pn−1 (n ≥ 1),
q0 = 0, q1 = 1,
qn+1 = kn qn + qn−1 (n ≥ 1),
then if kn ≥ 1 for n ≥ 1, necessarily qn → ∞ and pn /qn converges to ψ. We have
the following
Proposition 9.11. Let ψ be a real number and pn /qn (n = 1, 2, . . .) be the convergents defined by a continued fraction expansion with rational partial quotients for
ψ. Suppose, for n ≥ 0, that dn is a positive rational number with the property that
34
MICHAEL A. BENNETT AND SANDER R. DAHMEN
dn pn and dn qn are integers. If, for a given positive integer n, there exist integers p
and q satisfying
qn
qn+1
(58)
≤q<
dn−1
dn
and
ψ −
(59)
p 1
<
,
q dn (dn−1 + dn+1 ) q 2
then we may conclude that p/q = pn /qn .
Proof. This is Theorem 5 of [54], together with the observation that we may choose
positive rational numbers dn (rather than positive integers, as Wakabayashi does)
with the property that dn pn and dn qn are integers, and reach an identical conclusion.
In our case, we may take θ2 = [k0 ; k1 , k2 , . . .], where
k0 = −1, k1 = 1, k2 =
and k6 =
384a
1001
+
1728
1001 .
a
3
+ 1, k3 =
a
6
+ 34 , k4 =
8a
21
+
12
7 ,
k5 =
49a
288
2
4a
7
3
7a2
48
−
143a
144
4
− 52 , p7 = − 16a
3861 −
q0 = 0, q1 = 1, q2 = 1, q3 =
q5 =
4a3
189
+
20a2
63
q7 =
49
64
Corresponding convergents are
p0 = 1, p1 = −1, p2 = 0, p3 = −1, p4 = − a6 − 34 , p5 = − 4a
63 −
p6 = − 7a
648 −
+
+
16a5
11583
16a
7
+
+
44
7 ,
128a4
3861
+
q6 =
a
3
+
−
+
91a3
1296
3616a2
1287
−
464a
143
+
7a
12
+ 52 ,
11a2
16
+
245a
72
1568a
143
+
208
11 .
+
+
896a2
1287
a2
18
+ 2, q4 =
7a4
1944
1568a3
3861
32a3
429
+
−
−
16
7 ,
944
143 ,
117
16 ,
We may choose
d0 = 1, d1 = 1, d2 = 1, d3 = 3, d4 = 36, d5 =
189
11583
, d6 = 3888, d7 =
.
4
16
Note that from (57), we have
q3
q7
<y< .
d2
d6
We will appeal to Proposition 9.11 with ψ = θ2 and n = 3, 4, 5 and 6. Let us begin
by observing that the inequality
1
θ2 − x <
(60)
y dn (dn−1 + dn+1 ) y 2
is a consequence of (52), (57) and the fact that a > 106 , at least for n = 3 and 4. We
may thus apply Proposition 9.11 to conclude that either x/y = p3 /q3 , x/y = p4 /q4 ,
or y ≥ q5 /d4 . In the first case, we have that y = a + 6, contradicting (57). In the
second,
−6a − 27
x
= 2
y
2a + 21a + 90
contrary to xy(x + y) 6≡ 0 (mod 3). We thus have
y≥
a3
5a2
4a 11
+
+
+
1701 567 63 63
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
35
which, with (52), implies inequality (60) for n = 5 and 6. From Proposition 9.11,
we conclude that x/y = p5 /q5 or x/y = p6 /q6 , i.e. that
x
−3a2 − 27a − 108
−42a3 − 567a2 − 3861a − 9720
= 3
or
.
y
a + 15a2 + 108a + 297
14a4 + 273a3 + 2673a2 + 13230a + 28431
Again, this contradicts xy(x + y) 6≡ 0 (mod 3), completing the proof of Theorem
9.1.
10. A quartic family
It appears to be somewhat harder to find a convenient family of Klein forms of
index 3, since families for which corresponding Thue inequalities have been treated
in the literature fail to satisfy (7). We will consider instead (Klein) forms of the
shape
Fa (x, y) = 2x4 + 3x3 y − 3c(a) x2 y 2 + 3d(a) xy 3 − 3e(a) y 4 ,
where
2 c(a) = 18a4 + 204a3 + 867a2 + 1595a + 1038,
4 d(a) = 72a6 + 1224a5 + 8643a4 + 32106a3 + 65399a2 + 68268a + 28332
and
8 e(a) = 81a8 + 1836a7 + 18153a6 + 101871a5 + 353472a4
+773229a3 + 1036930a2 + 776604a + 248112.
Via Eisenstein’s criterion at the prime 3, Fa (x, y) is irreducible in Q[x, y], at least
provided a ≡ ±1 (mod 3). Writing κa = (2a + 3)(12a2 + 84a + 163), we prove
Theorem 10.1. Let a ≡ 17, 37 (mod 60) be an integer and suppose that κa is
squarefree. If, further, κa has no prime divisors congruent to 3, 17, 27 or 33 modulo
40, then the equation
Fa (x, y) = z n
has at most finitely many solutions in coprime integers x and y, and integers z and
n ≥ 3.
Proof. Assume henceforth that κa is squarefree (an old result of Erdős [17] ensures
that this occurs for infinitely many values of a). We begin by noting that the
family of forms Fa (x, y) has a number of properties reminiscent of our cubic family.
The discriminant of Fa (x, y), for instance, satisfies ∆Fa = −33 κ6a , while we have
corresponding Hessian Ha (x, y) = κa Ha∗ (x, y), where
Ha∗ (x, y) = −a1 (a)x4 + 6b1 (a)x3 y − 3c1 (a)x2 y 2 + 3d1 (a)xy 3 − 3e1 (a)y 4 ,
with
a1 (a) = 6a + 17, b1 (a) = 6a3 + 51a2 + 143a + 118,
2c1 (a) = 54a5 + 765a4 + 4308a3 + 11925a2 + 16028a + 8292,
2d1 (a) = 54a7 +1071a6 +9063a5 +42159a4 +115659a3 +185702a2 +160308a+57096
and
16e1 (a) = 162a9 + 4131a8 + 46656a7 + 304002a6 + 1249038a5
+3322347a4 + 5648376a3 + 5820520a2 + 3229248a + 711216.
We note that, for every a ∈ Z, the coefficients of Fa (x, y) and Ha∗ (x, y) actually lie
in Z.
A straightforward calculation yields, if Ha∗ (x, y) is even, that necessarily Ha∗ (x, y)
is divisible by 25 . Since, in such a case, we have both δ3 and Fa (x, y) odd, we
36
MICHAEL A. BENNETT AND SANDER R. DAHMEN
may conclude via (30) that ν2 (N (Ẽx,y )) > 0. Theorem 8.4 thus implies the desired result, unless there exist integers x and y for which either F (x, y) ∈ Z∗SFa or
Ha∗ (x, y) ∈ Z∗SFa .
To treat these equations, analogous to Lemma 9.2, we require information about
the lifting of roots of Fa (x, 1) modulo p.
Lemma 10.2. Let a ∈ Z, p > 3 be a prime and suppose that νp (κa ) = 1. Then,
for every pair of coprime integers x and y, we have
νp (Fa (x, y)) and νp (Ha∗ (x, y)) ∈ {0, 2}.
Proof. Let F (x, y) be any quartic Klein form as in (6), with αi ∈ Z and p - α0 .
Assume further that νp (δ3 ) = 3. Then the discriminant of F is divisible by p and
so F has a repeated factor over Fp . From (7), we may readily conclude that this
factor is not an irreducible quadratic over Fp . It follows that F has a root modulo
p. After a suitable linear transformation, we may assume that p divides α3 and α4 .
Repeatedly appealing to (7) and our assumption that
p3 | 27δ3 = 72α0 α2 α4 + 9α2 α3 α4 − 27α0 α32 − 27α4 α12 − 2α23 ,
we find that either p - α1 , in which case p3 | α4 , p2 | α3 , p | α2 and F has a
simple root modulo p, or p | α1 , whence p2 | α4 , p2 | α3 and p | α2 . In the latter
case, any root mod p is automatically a root mod p2 . Such a root cannot lift to
a root modulo p3 , since this would imply p3 | α4 and p2 | α2 , and hence p4 | δ3 ,
contradicting νp (δ3 ) = 3.
To apply this to the situation at hand, first of all note that p - Fa (1, 0) and,
by a resultant computation, also p - Ha∗ (1, 0). Next, it is a relatively easy matter
to check that all primes lying above p in the field defined by Fa (x, 1) (and hence
the field defined by Ha∗ (x, 1)) ramify (but not necessarily completely). This implies
that Fa (x, 1) and Ha∗ (x, 1) have no root in Zp , and hence no simple roots modulo
p. We are thus in the second case of the preceding paragraph, whereby the lemma
follows.
Suppose, from now on, that a ≡ ±1 (mod 3) (so that δ3 is coprime to 3). Then
νp (Fa (x, y)) ∈ {0, 2}
for every coprime x, y, and each p | κa . Since ν3 (Fa (x, y)) ∈ {0, 1}, it follows that
if Fa (x, y) is an SFa unit, necessarily
Fa (x, y) = ±3δ z 2
for δ ∈ {0, 1} and z an odd integer, coprime to 3. Assuming, further, that a ≡
1 (mod 4), then Fa (x, y) is either even or Fa (x, y) ≡ 1 (mod 8), whereby Fa (x, y) =
z 2 for some z, coprime to 6. In particular, it follows that Fa (x, y) ≡ 1 (mod 3),
contradicting Fa (x, y) ≡ 2x4 (mod 3).
Let us next suppose that Ha (x, y) is an SFa unit, whereby the same is necessarily
the case for Ha∗ (x, y). Again, we have that νp (Ha∗ (x, y)) ∈ {0, 2} for every coprime
x, y, and each p | κa , and that ν3 (Ha∗ (x, y)) ∈ {0, 1}. Since a short calculation
implies Ha∗ (x, y) ≡ 1 (mod 8), it follows that Ha∗ (x, y) = z 2 for some z dividing κa .
Now suppose that p is a prime dividing κa . The proof of Lemma 10.2 shows that
we have
Ha∗ (x, y) ≡ (−6a − 17)(x + my)4 (mod p),
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
37
for some integer m. It follows from Ha∗ (x, y) = z 2 that either p | z or −6a − 17 is a
quadratic residue modulo p. If p | 2a + 3 then
−6a − 17
−8
−2
=
=
,
p
p
p
and hence −6a−17 is a quadratic residue modulo p precisely when p ≡ 1, 3 (mod 8).
If p | 12a2 + 84a + 163, then, since the discriminant of this quadratic is −28 · 3,
there exists an integer k such that
7 2k
k 2 ≡ −3 (mod p) and a ≡ − ±
(mod p).
2
3
Thus −6a − 17 ≡ 4 ± 4k (mod p), whereby, again,
3 1±k
1±k
(1 ± k)3
−8
−2
−6a − 17
=
=
=
=
=
.
p
p
p
p
p
p
We conclude that, if p | κa and p ≡ 5 or 7 modulo 8, then necessarily p | z.
Restricting attention to a ≡ 2 (mod 5), it is easy to see that Ha∗ (x, y) cannot be
congruent to −1 modulo 5 and that κa ≡ 3 (mod 5). Since, by hypothesis, every
prime divisor of κa congruent to 1 or 3 (mod 8) is also ≡ ±1 (mod 5), we thus
conclude that z ≡ ±2 (mod 5), contradicting Ha∗ (x, y) ≡ 1 (mod 5).
It is likely that the restrictions we impose here upon the prime divisors of κa are
unnecessary. Indeed, provided only that κa squarefree and a is coprime to 3, we
would expect the Thue equations associated to Fa (x, y) and Ha∗ (x, y) to have no
solutions, with perhaps finitely many exceptions. We may check with Pari GP or
Magma that this is the case for all a ∈ Z with 3 - a and |a| ≤ 50, except if a = −2
(where the associated form satisfies F−2 (0, 1) = 3).
Remark 10.3. It is actually possible to handle this anomalous case a = −2 as
follows. We have
F−2 (x, y) = 2x4 + 3x3 y + 42x2 y 2 + 204xy 3 + 3y 4
and, in the notation of Theorem 8.4, we can find a suitable twist by t such that
N1 = 3α · 432 , for α ∈ {2, 3}. For all elliptic curves E with conductor N1 , the
(t)
equation j̃(x, y) = jE has no rational roots, and hence the equality N (Ẽx1 ,y1 ) = N1
cannot be satisfied (the Thue-Mahler equation F̃ (x1 , y1 ) ∈ Z∗S̃ has no solutions).
F
On the other hand, the equality j(x, y) = jE actually does have solutions,
(t)
whereby there exist coprime integers x1 and y1 for which Ex1 ,y1 has conductor
N1 . These elliptic curves correspond to the isogeny class 16641e1 in Cremona’s notation, with j-invariant 218 · 33 · 53 . Since these curves have complex multiplication,
we may argue as we will do in Section 13 to show that the equation F−2 (x, y) = z l
has no solutions for suitably large prime l ≡ 1 (mod 3). We can, however, do rather
better by examining images of inertia. Let E be an elliptic curve in isogeny class
16641e1, whereby a twist of E by −3 has good reduction at 3. This means that,
for l > 3 we have #ρE
l (I3 ) = 2, where I3 ⊂ Gal(Q/Q) denotes an inertia subgroup
at 3. Conversely, from the explicit formulae for ∆(x, y) and c4 (x, y), it is a simple
matter to check that E0 has no quadratic twist with good reduction at 3, whereby
0
#ρE
l (I3 ) > 2. Theorem 8.4 now tells us that the generalized superelliptic equation
in question has no solutions for any suitably large prime exponent l.
38
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Remark 10.4. Though it is very likely that the hypotheses of Theorem 10.1 are
satisfied for infinitely many a, this does not appear to follow in a straightforward
fashion from, say, application of the half-dimensional sieve to the polynomial κa .
It appears that, with a certain amount of work, it should be possible to generalize
the family Fa (x, y) to one with two parameters and analogous properties, which
would, in all likelihood, be provably infinite.
11. Higher degree families of Klein forms
It is a relatively easy matter to find infinite families of Klein forms of degrees 6
and 12 for which we can guarantee that equation (2) has finitely many solutions in
integers x, y, z and l ≥ 2. We appeal to the following result.
Proposition 11.1. Let F be a Klein form of index n and suppose that p 6∈ SF is
prime. If p has the additional property that F (x, y) ≡ 0 (mod p) for all integers
x and y, then there are at most finitely many solutions to equation (2) in integers
x, y, z and l ≥ 2. In particular, there are no such solutions with prime l satisfying


Y
√
log l > 27 · 34 
q 2  log (1 + p) .
q∈SF \{2,3}
Proof. Suppose that F is a Klein form of index n and that p is a prime, coprime
to n ∆F , with the property that F (x, y) ≡ 0 (mod p) for all integers x and y. It
follows that if x0 , y0 , z0 is a solutions in integers to (2), where l is prime, then
(t)
the Frey-Hellegouarch curve Ex0 ,y0 (with t chosen as usual) has (multiplicative)
bad reduction at p. From (38), either l ≤ 163 or there exists a weight 2 cuspidal
newform f , of trivial character and level coprime to p, for which, provided p 6= l,
we have
ap (f ) ≡ ±(p + 1) (mod L),
where L is a prime lying above l in Kf , the field of definition for the Fourier
coefficients of the form f . From the Weil bounds, we thus have
√ [K :Q]
l ≤ (p + 1 + 2 p) f ,
whereby arguing as in Section 8, we conclude as desired.
This result is much more specialized than it first appears. In fact, the only
pairs (n, p) for which the hypotheses of Proposition 11.1 are satisfied are (n, p) =
(4, 3), (4, 5), and (5, 11). To see this, note first that, since a primitive form F (x, y)
of degree k can have at most k zeros in P1 (Fp ), we necessarily have p ≤ k − 1. To
be precise, we require that
!
p−1
Y
(61)
F (x, y) ≡ xy
(x − iy) G(x, y) (mod p),
i=1
where G(x, y) is a form of degree k − p − 1, with no linear factors over Fp [x, y] (the
latter condition to ensure that p does not divide ∆F ). Since we assume that p 6= n,
this immediately contradicts the existence of primes p satisfying the hypotheses of
Proposition 11.1 for n ∈ {2, 3}. We may thus suppose that F has index 4 or 5 (which
justifies the assumption l ≥ 2 in the statement of Proposition 11.1). It follows that
the only candidates for primes p are p ∈ {3, 5} (if n = 4) and p ∈ {2, 3, 7, 11} (if
n = 5). To rule out (n, p) = (5, 2), (5, 3) and (5, 7), we appeal to (9), in conjunction
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
39
with (61); in each case, we may assume, without loss of generality, that G(x, y) is
monic in x. Suppose that F is a Klein form with coefficients as given in (6). From
(9), we thus have
α5 ≡ α7 ≡ α1 α3 + α2 ≡ α1 α11 + α3 α9 + α6 ≡ 0 (mod 2),
while (61) and our assumptions upon G imply that α0 and α12 are even, while α1
is odd, and hence, from the second equation in (9), α2 α3 + α4 ≡ 0 (mod 2). A
short computation using the fact that G has no linear factors in F2 [x, y], leads to
the conclusion that
9
X
G(x, y) ≡
αi x9−i y i (mod 2)
i=0
where (α0 , . . . , α9 ) is one of
(1, 0, 1, 0, 0, 1, 1, 0, 0, 1), (1, 0, 1, 0, 0, 1, 1, 1, 1, 1), (1, 1, 1, 1, 1, 0, 0, 0, 1, 1)
or (1, 1, 1, 1, 1, 0, 0, 1, 0, 1). In each case, the combination of the third, fifth and
seventh equations in (9) leads to a contradiction, modulo 4.
If (n, p) = (5, 3), since (9) implies that αi ≡ 0 (mod 3) for each i ∈ {4, 5, 7, 8},
whence, via the last equation of (9),
α1 α11 − α2 α10 + α62 ≡ 0 (mod 3),
we find that every octic form G(x, y) for which xy (x2 − y 2 ) G(x, y) satisfies (9)
modulo 3, has a linear factor in F3 [x, y]. In case (n, p) = (5, 7), if we write G(x, y) =
x4 + ax3 y + bx2 y 2 + cxy 3 + dy 3 , then the first 3 equations in (9) imply that
2b + a2 ≡ c + ab ≡ d + ac + 4b2 (mod 7).
A routine check with Magma verifies that G(x, y) satisfying these congruences all
have linear factors modulo 7.
For the remaining pairs (n, p) = (4, 3), (4, 5), and (5, 11), in each case there exist
infinitely many inequivalent forms satisfying the hypotheses of Proposition 11.1.
To see this, for (n, p) = (4, 3), note that we necessarily have
G(x, y) ∈ x2 + y 2 , x2 + xy + 2y 2 , x2 + 2xy + 2y 2 .
For each of these we may check, via (8), that xy (x2 − y 2 ) G(x, y) is a Klein form
modulo 3, which leads to 3 families of sextic forms satisfying the hypotheses of
Proposition 11.1 with (n, p) = (4, 3). By way of example, fixing for simplicity
F (1, 0) = 3, we find the families
(3, b, 15c, 90d)6
(3, b, 5c, 10d)6
(3, b, 5c, 10d)6
with b ≡ 1 (mod 3), c2 − d ≡ 1 (mod 3),
with b ≡ −c ≡ d ≡ 1 (mod 3) and bd − c2 ≡ 3 (mod 9),
with b ≡ c ≡ d ≡ 1 (mod 3) and bd − c2 ≡ −3 (mod 9).
Similarly, xy (x4 − y 4 ), satisfies (8), modulo 5, which provides us with infinitely
many forms with (n, p) = (4, 5), for instance
(5, b, 25c, 250d)6 with b ≡ 1 (mod 5), c2 − d ≡ 1 (mod 5).
Finally, for (n, p) = (5, 11), the fact that xy (x10 − y 10 ) is a Klein form modulo 11
leads to the desired conclusion. An example of a family of forms in this case is
provided by
(11, 12, 22c, 55d)12 with c ≡ 16 (mod 114 ), d ≡ 1 (mod 114 ).
40
MICHAEL A. BENNETT AND SANDER R. DAHMEN
One readily shows that each of these families contain infinitely many GL2 (Q) inequivalent forms. We thus have
Corollary 11.2. If n ∈ {4, 5}, there are infinitely many GL2 (Q)-inequivalent Klein
forms of index n for which equation (2) has at most finitely many solutions in
integers x, y, z and l ≥ 2.
12. Heuristics for cubic forms
As we have seen, there exist infinitely many cubic forms with the property that
equation (3) is insoluble in integers. In this section, we will sketch a heuristic to
indicate that this is the usual state of affairs, that, in fact, a “typical” binary cubic
form F (x, y) ∈ Z[x, y] has this property (and hence that Theorem 1.1 applies to
almost all cubic forms, excepting a set of density zero).
For a cubic form F (x, y), we have F (−x, −y) = −F (x, y) and hence if F represents an integer m, it also necessarily represents −m. We may thus restrict
attention to the representation of positive integers by F . Let us quantify what we
mean by a “typical” form. We begin by noting a result of Davenport [10], [11] (we
can sharpen the error term here by appealing to work of Shintani [45], [46], but
this is unnecessary for our argument):
Theorem 12.1. (Davenport, 1951) Let H3 (A, B) denote the number of GL2 (Z)
equivalence classes of primitive, irreducible binary cubic forms F ∈ Z[x, y], with
A < ∆F ≤ B. Then, as X → ∞,
H3 (0, X) =
5
X + O(X 15/16 )
4π 2
and
H3 (−X, 0) =
15
X + O(X 15/16 ).
4π 2
Note that the seeming discrepancy between this result and that stated in [10]
and [11] derives from a missing factor of 3 in the statement of the main theorem
of [10], together with the fact that the estimates in [10] and [11] are for classes
of properly equivalent, rather than equivalent, forms; i.e. for SL2 (Z) instead of
GL2 (Z) equivalence classes, with, additionally, no restriction to primitive forms.
The assumption here that our forms be irreducible can be relaxed, via application
of Lemma 3 of [10].
With this result in mind, our goal is to derive a heuristic to suggest that the
number of GL2 (Z) equivalence classes of primitive, irreducible binary cubic forms
F ∈ Z[x, y], with −X < ∆F ≤ X, say, for which (3) has integer solutions, which
(3)
we will denote H3 (−X, X), satisfies
(3)
H3 (−X, X) = o(X) as X → ∞,
i.e. such forms are “atypical” in the sense that they have zero density in the set of
all classes of forms.
We begin by noting that we can unconditionally bound the number of integers
up to a given X which are represented by a fixed irreducible binary cubic form.
The following pair of results are the main theorem of Thunder [52] and a special
case of Theorem 1 of Bean [1], respectively.
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
41
Proposition 12.2. Let F (x, y) ∈ Z[x, y] be a cubic form of discriminant ∆F which
is irreducible over Q and let X ≥ 1. Let NF (X) and AF denote the number of
integral solutions to the inequality |F (x, y)| ≤ X, and the area of the region
(x, y) ∈ R2 : |F (x, y) ≤ 1 ,
respectively. Then
2008 X 1/2
+ 3156 X 1/3 .
NF (X) − X 2/3 AF < 9 +
|∆F |1/2
Proposition 12.3. If F (x, y) ∈ Z[x, y] is a cubic form of discriminant ∆F 6= 0,
then
3 Γ(1/3)2
1/6
|∆F | AF ≤
< 16.
Γ(2/3)
Taken together, for suitably large |∆F | and X, these results imply that
NF (X) < X 2/3 ,
and, of particular interest for our purposes, that
(62)
#{1 ≤ n ≤ X : F (x, y) = n for some (x, y) ∈ Z × Z} < X 2/3 .
Suppose next that SF is the set of primes p dividing 2 ∆F . We would like to find
an upper bound for
#{1 ≤ n ≤ X : n ∈ Z∗SF }.
Write ψ(X, Y ) for the number of Y -smooth integers ≤ X, and denote by pt , the
t-th prime. It follows that
#{1 ≤ n ≤ X : n ∈ Z∗SF } ≤ ψ(X, pω(∆F )+1 ),
where ω(m) denotes the number of distinct prime divisors of a positive integer m.
1/16
If X ≥ |∆F |
, say, then
log X
,
ω(∆F ) log log X
where the implied constant is absolute. It follows that pω(∆F )+1 log X and
1/16
hence, for large enough |∆F | and X ≥ |∆F |
, we have
(63)
#{1 ≤ n ≤ X : n ∈ Z∗SF } ≤ ψ(X, log7/6 X) = X 1/7+o(1) X 1/6 .
Readers interested in bounds for ψ(X, Y ) (and much more besides) are directed to
the survey article of Granville [22].
We will make the following heuristic assumption. Let us write mF for the smallest positive integer represented by the form F , i.e.
mF = min {m ∈ N : there exist x, y ∈ Z with F (x, y) = m} .
We will assume that the number of GL2 (Z)-equivalence classes of binary cubic forms
with |∆F | ≤ X and mF < |∆F |1/16 is o(X) as X → ∞. In fact, for our purposes,
all we need is a like statement with 1/16 replaced by any fixed δ > 0. We actually
believe that the same conclusion holds with the exponent 1/16 replaced by any
δ < 1/4. By an old theorem of Mordell [34], this would be the strongest possible
1/4
statement of this nature, as mF ≤ (|∆F |/23) , for every cubic form F . Lemma
4 of Davenport [10] ensures that for all but O(X 15/16 ) classes of cubic forms, as
X → ∞, the Hermite reduced form for F satisfies F (1, 0) > |∆F |1/16 . In many
cases, but not all, mF = F (1, 0) for such forms.
42
MICHAEL A. BENNETT AND SANDER R. DAHMEN
With our heuristic assumption in hand, we appeal to a straightforward density
argument. Let X be large. Then for all but o(X) classes of primitive cubic forms
F (x, y) with |∆F | ≤ X, we have mF ≥ |∆F |1/16 , whereby, from (62) and (63), the
expected number of positive integers that are in
Z∗SF ∩ {F (x, y) : (x, y) ∈ Z × Z}
is, for a given form outside this exceptional set,
Z ∞
X 2/3 X 1/6
·
dX |∆F |−1/96 .
X
|∆F |1/16 X
It follows (where we are of course relying upon the rather speculative independence
of the representation of an integer by a cubic form F from that of said integer being
an SF -unit) that the “probability” that a given non-exceptional form F represents
an SF -unit tends to 0 as |∆F | → ∞.
We can derive similar heuristics (of equal plausibility) for higher degree Klein
forms (whereby our expectation is that there are at most finitely many solutions to
equation (2) in integers x, y, z and l ≥ max{2, 7 − k}, for “almost all” Klein forms
of degrees k = 4, 6 and 12). For k > 3, however, Klein forms constitute a density
zero subset of the set of all forms of degree k.
13. Diagonal forms
In this section, we will turn our attention to what are, in some sense, the simplest
binary forms. We call a binary form diagonal if it is GL2 (Z) equivalent to a form
of the shape F (x, y) = axk + by k , for integers a and b. It is clear from equations
(7), (8) and (9) that the only diagonal Klein forms are of degree k = 3. Since such
forms have discriminant
2
∆F = −33 (ab) ,
the conditions of Theorem 1.1 are never satisfied (F (1, 0) divides ∆F , for instance).
Despite this, we can still appeal to Theorem 8.4 to deduce some useful Diophantine
information.
We begin by noting that our conductor calculations become rather more straightforward for diagonal cubic forms; for simplicity, we will initially restrict our attention to those forms with odd coefficients. Let U denote the set of integers congruent
to ±1 modulo 9, and V consist of those ≡ ±2, ±4 (mod 9).
Proposition 13.1. Suppose that a and b are cubefree coprime odd integers and let
F (x, y) = ax3 + by 3 . Define the family of associated elliptic curves Ex,y as in (20).
(t)
(t)
Then, if x and y are coprime integers, the conductor N (Ex,y ) of Ex,y satisfies, for
t equal to one of ±1, ±2, ±3, ±6,
Y
Y
(t)
N (Ex,y
) = 2α · 3β
p2
p,
p|ab
p|ax3 +by 3
p - 6ab
where
α
0
1
2
3
5
Conditions
if ν2 (ax3 + by 3 ) = 4
if ν2 (ax3 + by 3 ) ≥ 5
if ν2 (xy) ≥ 2
if ν2 (xy) = 1 or ν2 (ax3 + by 3 ) ∈ {2, 3}
if ν2 (ax3 + by 3 ) = 1
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
43
and
β
1
2
3
Conditions
if 3 | ax + by and either a, b, ab ∈ V or a, b ∈ U,
if 3 - ax3 + by 3 and either a, b, ab ∈ V or a, b ∈ U,
if either a, b ∈ U, or a, b ∈ V, ab ∈ U, or 3 | ab.
3
3
Proof. The computations of νp (N (Ex,y )) is straightforward for p > 3, but requires
some more work (and twisting to minimize) for p = 2 and 3. We use Tate’s
algorithm; in practice we appeal to Papadopolous [36] for most (but not all) cases.
As an example of the kind of result one may obtain from Theorem 8.4 in this
situation, we have the following.
Theorem 13.2. Let a and b be odd integers such that ax3 + by 3 is irreducible in
Q[x, y] and suppose that all solutions to the equation
Y
(64)
ax3 + by 3 =
pαp
p | 6ab
in coprime nonzero integers x and y, and nonnegative integers αp , satisfy α2 ∈
{1, 4}. Then there exists an effectively computable constant l0 such that the equation
ax3 + by 3 = z l
(65)
has no solution in nonzero integers x, y and z (with gcd(x, y) = 1), and prime l ≡ 1
(mod 3) with l ≥ l0 . If all solutions to (64) with x and y coprime integers have
α2 ≤ 4, then there exists an effectively computable constant l1 such that equation
(65) has no solutions in odd integers x, y and prime l ≥ l1 .
Proof. Suppose that we have ax30 +by03 = z0l where x0 , y0 and z0 are nonzero integers
with gcd(x0 , y0 ) = 1, and l is prime. Then, via Theorem 8.4, we find that either
l is bounded as in (41), or we deduce the existence of an integer solution (x1 , y1 )
(t)
to equation (64), for which we have, writing Ei for Exi ,yi (with t as in Proposition
13.1),
ν2 (N (E1 )) = ν2 (N (E0 )).
By Proposition 13.1, it follows that
ν2 (N (E0 )) ∈ {1, 2, 3}
(and that ν2 (N (E0 )) = 1, if x0 y0 is odd). From our assumptions about solutions
to equation (64), we thus have x1 y1√= 0, whereby E1 has j-invariant 0 and hence
complex
Z[(1 + −3)/2]. If l ≡ 1 (mod 3), then l splits in
√ multiplication by
E0
1
Z[(1+ −3)/2], and so ρE
(G
Q ) and hence ρl (GQ ) are contained in the normalizer
l
of a split Cartan subgroup of GL2 (Fl ). By work of Momose [33] (and since we may
suppose l > 13), we conclude that
jE0 =
28 · 33 ab(x0 y0 )3
28 · 33 ab(x0 y0 )3
=
∈ Z[1/2].
(ax30 + by03 )2
z02l
In fact, Merel [32] showed that jE0 ∈ Z, but the weaker result is adequate for our
purposes. Indeed, since we suppose x0 , y0 , z0 to be nonzero (with gcd(x0 , y0 ) = 1),
we reach the desired contradiction, provided l is suitably large in terms of a and
b, unless z0 = ±1 (which itself contradicts our assumptions upon possible solutions
to (64)).
44
MICHAEL A. BENNETT AND SANDER R. DAHMEN
0
Remark 13.3. In case l ≡ −1 (mod 3) in the preceding proof, the image of ρE
is
l
contained in the normalizer of a non-split Cartan subgroup of GL2 (Fl ). A conjecture
of Serre (stated as a question in Section 4.3 of [43]) then implies, for suitably large l,
that E0 has complex multiplication. Assuming this conjecture (or something rather
weaker, such as integrality of the corresponding j-invariant jE0 ), the restriction to
l ≡ 1 (mod 3) in Theorem 13.2 can be removed.
As a rough indication of the applicability of Theorem 13.2, we note that there
are precisely three pairs of coprime odd positive integers (a, b) with ab < 100 and
a < b, for which all solutions to equation (64)) in coprime nonzero integers x and
y, and nonnegative integers αp , satisfy α2 ∈ {1, 4}:
(a, b) ∈ {(1, 57), (1, 83), (3, 19)} .
In addition to these, if we require only α2 ≤ 4, we find the following pairs (a, b) :
(1, 15), (1, 23), (1, 43), (1, 47), (1, 51), (1, 99), (3, 11), (3, 13), (3, 25), (5, 9).
It is a simple matter to find diagonal cubic forms ax3 + by 3 for which the corresponding Thue-Mahler equations immediately reduce to Thue equations, in a
similar fashion as that of our cubic family in Section 9. For example, if we take a
and b to be cubefree nonzero coprime integers with ab even and ab2 6≡ ±1 (mod 9),
then if x and y are coprime integers satisfying (64), it follows that ax3 + by 3 = 3δ k,
where either δ ∈ {0, 1} and k | ab (more specifically, νp (k) ∈ {νp (ab), 0} for each
p | ab), if ab is coprime to 3, or δ = 0 and k | ab, otherwise. We thus reduce (64) to
solving the family of Thue equations a0p
x 3 + b0 y 3 = √
c, where a0 , b0 range over all
3
3
positive cubefree integers for which Q( a0 b20 ) = Q( ab2 ) and c = 1 or 3 (where
a0 b0 is coprime to 3 in the latter case). From work of Stender [50], the number of
such equations with a solution in nonzero integers is, with a pair of exceptions, at
most one. Specifically, combining Satz 1 and Satz 2 of [50], we have
Theorem 13.4. (Stender) Let D > 1 be a cubefree integer. Then, if a and b are
cubefree positive integers, there is at most
one equation
of the shape ax3 + by 3 = c
p
√
3
3
with c ∈ {1, 3}, gcd(ab, c) = 1 and Q( a/b) = Q( D), with as many as a single
solution in nonzero integers x and y, unless D = 2 (or 4), or D = 20 (or 50). For
these values of D, we have solutions corresponding to
(a, b, c, x, y) = (2, 1, 1, 1, −1), (2, 1, 3, 1, 1), (2, 1, 3, 4, −5), (4, 1, 3, 1, −1)
and
(a, b, c, x, y) = (20, 1, 1, 7, −19), (5, 2, 3, 1, −1),
respectively. If, for a given D, we have a solution to such an equation ax3 + by 3 = c
with xy 6= 0, then either
(i) min{a, b} = 1, c =√1, and (a, b, c) 6∈ {(19, 1, 1), (20, 1, 1), (28,√1, 1)}, in which
√
3
case η = x 3 a + y 3 b is the fundamental unit in the field Q( √ab2 ), or
√
3
(ii) (a, b, c) ∈ {(19, 1, 1), (20, 1, 1), (28, 1, 1)},
where η = x a + y 3 b is the square
√
3
of the fundamental unit in the field Q( ab2 ), or
√
√ 3
(iii) either min{a, b} > 1, or min{a, b} = 1 and c = 3, and η = 1c x 3 a + y 3 b
√
3
is the fundamental unit in Q( ab2 ), or its square (with the sole exception of
(a, b, c) = (2, 1, 3)).
Arguing as in the proof of Theorem 13.2 (only without appeal to Proposition
13.1), we thus have
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
45
Theorem 13.5. Let D be an even positive cubefree
√ integer with D 6≡ ±1 (mod 9)
and let 0 < < 1 be the fundamental unit in Q( 3 D). Suppose that is not of the
√
√
√ 3
√
shape x 3 a + y 3 b and that neither nor 2 is of the shape 1c x 3 a + y 3 b , with
c ∈ {1, 3}, x √
and y nonzero
√ integers, and a and b coprime positive cubefree integers
3
for which Q( ab2 ) = Q( 3 D). Then there exists an effectively computable constant
l0 , depending only on D, such that, for each such pair a and b, the equation
ax3 + by 3 = z l
has no solutions in nonzero integers x, y and z, with gcd(x, y) = 1, and prime
l ≡ 1 (mod 3) with l ≥ l0 .
A short computation with Pari GP indicates that this theorem is applicable to
the following D < 150:
D ∈ {34, 38, 74, 78, 84, 86, 92, 94, 102, 106, 114, 132, 138, 142, 146} .
We are unaware
√ of simple criteria for determining whether, given D, the fundamental unit of Q( 3 D) takes a “binomial” form as in the hypotheses of Theorem 13.5.
Our calculations indicate that this occurs infrequently (and hence that this theorem
is applicable to “most” cubefree D satisfying the given congruences modulo 18).
14. Examples and computations
Given a Klein form F , we know essentially three distinct ways to determine, in an
effective manner, whether the corresponding Thue-Mahler equations have solutions
or not. The first is to solve the equations, as in Tzanakis and de Weger [53], through
a combination of lower bounds in linear forms in complex and p-adic logarithms,
with techniques from computational Diophantine approximation. A second method
is to appeal to the syzygy (12), which shifts the problem to one of determining the
S-integral points on a collection of (Mordell) elliptic curves. Since the number of
such curves is exponential in the cardinality of SF , in many situations this method
appears impractical.
A third approach is to actually compute models for all elliptic curves E/Q at
the corresponding levels, say via modular symbols, and to check whether or not
there exists one with E[n] isomorphic to that coming from the Klein form. One has
the feeling that, in all but the simplest cases, this last approach is the least computationally efficient (though we are unaware of a complexity analysis to confirm
this).
In the remainder of this section, we will illustrate the first and third of these
methods, providing the results of somewhat extensive computations.
14.1. Solving Thue-Mahler equations. In the case of cubic forms, both as a
“reality check” on our heuristics, and to illustrate the utility of these methods, we
implemented the algorithm for solving Thue-Mahler equations detailed in Tzanakis
and de Weger [53]. After computing (Hermite) reduced representatives for each
class of irreducible, primitive cubic forms with |∆F | ≤ 106 , in each case we solved
the corresponding equation (3). We summarize our results in the following table.
More details, including lists of forms satisfying the hypotheses of Theorem 1.1, are
available from the authors on request. Let us define, as before, H3 (A, B) to be the
number of GL2 (Z)-equivalence classes of primitive irreducible binary cubic forms
F ∈ Z[x, y] with discriminant A < ∆F ≤ B and let H3∗ (A, B) to be the analogous
46
MICHAEL A. BENNETT AND SANDER R. DAHMEN
counting function for those forms for which equation (3) has no integral solutions.
Then we have
X
102
103
104
105
106
H3 (0, X) H3∗ (0, X) H3 (−X, 0)
2
0
7
30
0
160
484
4
2201
6765
69
26875
83636
2174
303136
H3∗ (−X, 0)
0
0
20
741
19311
Recall that our heuristic predicts
H3∗ (−X, 0)
H3∗ (0, X)
= lim
= 1.
X→∞ H3 (−X, 0)
X→∞ H3 (0, X)
lim
Dan Shanks once noted that log log log x tends to infinity “with great dignity.”
(Math. Comp. 13 (1959), page 272); we believe something similar to be occurring
here.
14.2. Studying elliptic curves at candidate levels. Given a Klein form F , if
we have access to a full set of isomorphism classes of elliptic curves E/Q at the
various levels N that can arise from possible solutions to equation (2), we can, in
principle, do without most of the theory we have developed so far. Indeed, we may
simply appeal to the congruences from Proposition 8.1 to eliminate the possibility of
a given elliptic curve giving rise to our newform f , for large enough prime exponent
l. It is possible, however, to carry out this check in a more elegant fashion, with
no need for explicit calculation of Fourier coefficients for the various elliptic curves
(Frey-Hellegouarch and otherwise) encountered.
Let F be an irreducible Klein form of index n (with δn nonsquare, in case n = 4)
and, as previously, denote by j(x, y) the j-invariant associated to Ex,y . By a direct
application of Theorem 8.4, for an elliptic curve E at a candidate level N , with
j-invariant j0 , we need only check that j0 is not of the form j(x, y) for (coprime)
integers x and y and, if n = 3 or 5, that j00 is not of the form j(x, y). Here
j00 = 17282 /j0 , if n = 3, and j00 = J(j0 ) (as in (31)), if n = 5. If j00 = ∞,
this value can be ignored, since then j00 = j(x, y) corresponds to a root of F ,
which is irreducible by assumption. Note that solving the equation j0 = j(x, y)
(or j00 = j(x, y)) amounts to finding rational roots of a univariate polynomial over
the rationals. If j0 = 0 or j0 = 1728, and this check fails to eliminate E, then it
could still happen that KnE is reducible or that the fields determined by F (x, 1)
and KnE (x, 1) are not isomorphic. If this is the case, one can still eliminate E.
Note that what we are doing here amounts to considering only the values of
the Fourier coefficients ap of a candidate curve E, up to sign. Of course, there
may occur instances where such an approach fails to discard an elliptic curve E
which we can actually eliminate through careful examination of the ap (taking
their signs into account). In practice, it appears that such a situation does not
arise particularly often, especially if care is taken to identify the levels N which can
genuinely correspond to solutions of our superelliptic equations.
We already know that our Frey-Hellegouarch curve attached to (2), after level
lowering, corresponds to a modular form at some level in Z∗SF . In fact, we can
typically restrict the possible levels occuring somewhat further. Although tedious
in many cases, especially when in comes down to determining possible ν2 (N ) and
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
47
ν3 (N ), this is a straightforward task. For more information we refer to the appendix.
The remainder of this section contains a number of examples, for each index n ∈
{2, 3, 4, 5}, where we explicitly determine “minimal” such N .
14.2.1. Cubic forms. We list reduced forms F (x, y) = (α1 , α2 , α3 , α4 )3 with |∆F | ≤
104 and the property that equation (3) has no integral solutions. Our first table
contains forms of negative discriminant:
α0
3
5
3
3
3
3
3
5
6
3
α1
2
6
8
4
7
1
5
1
8
1
α2
5
7
9
9
10
7
6
4
10
8
α3
3
3
7
5
9
3
7
3
3
3
∆F
−2063
−2423
−2591
−5087
−19 · 269
−22 · 1283
−13 · 443
−6271
−22 · 31 · 53
−6983
α0
3
3
3
3
3
3
3
3
3
3
α1
1
4
4
7
7
5
4
1
2
5
α2
6
2
10
8
12
8
3
8
8
3
α3
5
6
6
9
11
9
7
−3
6
7
∆F
−79 · 89
−22 · 1931
−22 · 1931
−7823
−17 · 487
−37 · 251
−9343
−9551
−22 · 2411
−22 · 2459
The corresponding levelsQwe need to consider for these examples are, after
suitable twisting, N = 2β2 p|∆F p, where the product is over odd prime p and
β2 ∈ {0, 5} or {2, 3}, if 2 divides, or fails to divide ∆F , respectively.
If we consider forms of positive discriminant, there are precisely four classes with
∆F < 104 :
α0
3
3
3
3
α1
2
1
4
−1
α2
−7
−8
−7
−10
α3
−3
−3
−3
−3
∆F
672
732
8017
72 · 132
Level(s)
2δ · 672 , δ ∈ {2, 3}
2δ · 732 , δ ∈ {2, 3}
2δ · 8017, δ ∈ {2, 3}
δ
2 · 72 · 132 , δ ∈ {2, 3}
All the levels involved in these 24 examples are smaller than 200000, which means
that all elliptic curves at these levels (i.e. with these conductors) are available in
Cremona’s database. For each of these forms, therefore, there was no need to solve
the corresponding Thue-Mahler equations. Instead, one could simply have checked
that the equation j0 = j(x, y) has no rational roots, for each j-invariant j0 of an
elliptic curve at the given levels.
14.2.2. Quartic forms. Binary forms of degree 4 have, as noted in Section 2, exactly
two independent invariants, traditionally denoted I and J – in our notation, I =
τ4 (F ) = 0 and J = 27 δ3 , where ∆F = −27 δ32 . In Cremona [6], one finds an
algorithm (based on Julia’s theory of reduction) for finding an SL2 (Z)-reduced
representative for each class of quartic forms with given (I, J). Implementing this
in the case I = 0, we may list quartic Klein forms with |∆F | below a given bound.
We assume, replacing F by −F if need be, that δ3 > 0. Applying Proposition
14 of [6], we may thus conclude that there exists a (not necessarily Julia-reduced)
representative (α0 , α1 , α2 , α3 )4 for each form with invariants I = 0 and J = 27 δ3 ,
satisfying
√
3 1/3
1 1/3
−
δ
≤ α0 ≤ √ δ3 , −2|α0 | < α1 ≤ 2|α0 |,
2 3
3
48
MICHAEL A. BENNETT AND SANDER R. DAHMEN
and
H1 ≤ 8α0 α2 − 3α12 ≤ H2 ,
where
H1 =
1/3
6 δ3
max −2(α0 +
and
H2 =
1/3
6 δ3
1/3
δ3 ), α0
min −2 α0 , α0 +
q
q
2/3
2
− 4δ3 − 3α0
2/3
4δ3
−
3α02
.
In the following, we list (Julia) reduced representatives F = (α0 , α1 , α2 , α3 )4 for
classes of Klein forms with |δ3 | ≤ 4000, and for which the corresponding ThueMahler equation has no small solutions (i.e. solutions with, say, |x|, |y| ≤ 100).
We suspect that the superelliptic equations corresponding to these forms are, for
suitably large exponents, insoluble.
α0
6
4
2
6
4
2
6
6
2
α1
α2
13
−9
1
−18
−13 −9
3
−21
5
3
−5
24
−3
9
15
−3
7
−12
α3
−9
28
−7
−15
−25
−8
−25
13
24
|δ3 |
3 · 599
1907
2053
2411
2683
2789
3 · 1103
3391
3391
α0
5
2
7
5
7
4
2
2
α1
−8
1
−13
16
−11
7
7
3
α2
α3
|δ3 |
−18 19
3491
−21 51
3517
−24 −4
3581
−12 −7
3659
−18
8
3 · 1237
6
−28
3739
−18 28
3907
21 −15
3923
To prove our suspicions correct (without actually solving the Thue-Mahler equations in the traditional manner), we observe that the corresponding levels we are
required to consider for these examples are, after suitable twisting,
N = 3β · rad3 (δ3 ), β ∈ {2, 3},
for those forms with δ3 coprime to 3, and
N = 3β · rad3 (δ3 ), β ∈ {1, 4},
otherwise. Here, we denote by rad3 (δ3 ) the product of primes 6= 3 dividing δ3 . We
note that for some of these examples where ν3 (N ) = 2, there exists a quadratic
twist with ν3 (N ) = 1, but for such a form, we always still need to consider the case
ν3 (N ) = 2.
Again, for each of these forms, data for all corresponding elliptic curves are
available. It is again an easy matter to check that for all these quartic forms F
and all j-invariants j0 of the elliptic curves at the levels corresponding to F , we
have that the associated equations j0 = j(x, y) and 17282 /j0 = j(x, y) both have
no rational roots.
We conclude this subsection with a number of illustrative examples of quartic
Klein forms.
Example 14.1. Consider the Klein form F = (2, 55, 429, 85)4 , which has δ3 = 1632
and corresponding levels N = 3β · 163, β ∈ {2, 3}. The equations j0 = j(x, y) have
no solutions, but the equations 17282 /j0 = j(x, y) do have a solution in one case,
namely for the elliptic curve
E : y 2 + y = x3 − 18x − 34,
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
49
which has conductor 32 ·163 and j-invariant j0 = −884736/163. We have 17282 /j0 =
j(−85, 8). This is also illustrated by the fact that the 3-division polynomial of E,
given by 3x4 − 108x2 − 405x − 324, defines the same field as
F (x, 1) = 2x4 + 55x3 + 429x2 + 85x − 7084.
On further inspection, one can actually discard this elliptic curve E by using
√ an
image of inertia argument (this is basically possible because its twist over −3
has conductor 163, but a similar twist of the Frey-Hellegouarch curve still has
potentially good additive reduction at 3).
Our final two examples of this subsection are chosen to illustrate the necessity
of considering the companion form H and of the inclusion of the prime 2 in S̃F . In
both cases, one may check that straightforward image of inertia arguments cannot
be used to eliminate the elliptic curves in question.
Example 14.2. Consider F = (2, 1, −18, 4)4 with δ3 = 1637. We can show that
the equation j0 = j(x, y) is insoluble, for each j-invariant j0 of an elliptic curve E/Q
of conductor 32 · 1637 and 33 · 1637. On the other hand, we have H(23, 27) = 16372 ,
corresponding to an E/Q of conductor N = 33 · 1637 (the conductor of E1,0 is
2 · 33 · 1637).
Example 14.3. Consider F = (5, 4, −120, 85)4 with δ3 = 43 · 1012 . Again, the
equation j0 = j(x, y) has no solutions. We have H(7, −6) = 24 · 3 · 432 · 101,
corresponding to an E/Q of conductor N = 33 · 43 · 1012 . Here, the conductor of
E1,0 is precisely 5N .
14.2.3. Sextic forms. For sextic forms (and the same remark applies to forms of
higher degree), there are a number of viable ways to identify distinguished forms
in a given GL2 (Z)-equivalence class (see e.g. Cremona and Stoll [51], and Edwards
[13]). These reduction theories are quite involved, however, and we will content
ourselves with a simplistic search for examples, by considering only forms with
small naive height (i.e. max |αi |). Such a search reveals a number of examples
which potentially satisfy the hypotheses of Corollary 8.5 (in that the corresponding
Thue-Mahler equations have no small solutions). We have not actually attempted
to run the algorithm of Tzanakis and de Weger for such forms; the dependence
upon the degree of the form (or more specifically, upon the number of fundamental
units in the field corresponding to F (x, 1)) is a severe one.
Remark 14.4. Our search revealed many sextics which could be obtained from
a sextic in the list below by a matrix transformation over Z (not necessarily with
unit determinant). We also found cases where the matrix transformation could
not be defined over Z, but the resulting fields are isomorphic. In our list, we have
suppressed such examples.
50
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Form
(3, 2, 25, −40)6
(3, 8, 20, 50)6 ,
(3, 4, −105, −540)6
(5, 4, −40, −140)6
(3, 4, 20, 10)6 ,
(3, 1, −10, −80)6
(5, 9, 5, 40)6
(5, 2, −70, 180)6
(3, 8, 20, 140)6
(3, 5, −140, 530)6
(7, 6, 20, 50)6
(7, 6, −15, 120)6
(3, 4, −25, 10)6
(3, 2, −20, 50)6
|δ4 |
22 · 331
227
22 · 239
22 · 491
251
22 · 397
419
22 · 439
22 · 907
22 · 461
523
22 · 33 · 41
557
571
Level(s)
22 · 331
δ
2 · 227, δ ∈ {2, 3}
23 · 239
22 · 491
δ
2 · 251, δ ∈ {2, 3}
2δ · 397, δ ∈ {1, 3}
23 · 419
22 · 439
22 · 907
δ
2 · 461, δ ∈ {1, 3}
2δ · 523, δ ∈ {2, 3}
22 · 33 · 41
δ
2 · 557, δ ∈ {2, 3}
2δ · 571, δ ∈ {2, 3}
In all these cases, the levels we encounter are sufficiently small that data for all
elliptic curves are available. It is again an easy matter to check that for each of these
sextic forms F and all j-invariants j0 of the elliptic curves at the levels corresponding
to F , we have that the associated equation j0 = j(x, y) has no rational roots. As
before, we conclude that the corresponding superelliptic equations have no solutions
for suitably large prime exponents.
Remark 14.5. As noted previously, squares of sextic Klein forms arise as covers
of cubic Klein forms. By way of example, the first sextic form in the above table
F (x, y) = 3x6 + 2x5 y + 25x4 y 2 − 40x3 y 3 − 55x2 y 4 + 6xy 5 − 29y 6
satisfies (3F (x, y))2 = G(p(x, y), q(x, y)) with
G(u, v)
=
3u3 + u2 v + 5uv 2 − 2v 3
p(x, y)
=
3x4 − 10x2 y 2 + 16xy 3 + 11y 4
q(x, y)
=
4y(3x3 + x2 y + 5xy 2 − 2y 3 ).
The cubic form G has discriminant −32 · 331, while the discriminant of F is not
divisible by 3. In fact, there are no cubic Klein forms such that one of the corresponding level is 22 · 331. Similar remarks hold for the other sextic forms in the
table.
14.2.4. Duodecic forms. A routine search, similar to that for sextic forms, reveals
some examples of Klein forms of index 5 and moderate corresponding levels, for
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
51
which the Thue-Mahler equations have no small solutions.
Form
(2, 3, −385, 4125)12
(2, 1, −165, 1375)12
(2, 3, 66, −1100)12
(2, 5, −473, 5115)12
(2, 3, −264, 1760)12
(2, 5, −308, 2640)12
(2, 1, −198, −1540)12
(2, 3, 143, −1155)12
(2, 1, −88, 1760)12
(2, 1, −66, 220)12
(2, 3, −66, −1100)12
|δ5 |
Level(s)
53 · 61
5 · 61
52 · 101
5 · 101
33 · 5 · 72
33 · 5 · 7
5 · 331
5 · 331
33 · 52 · 13
33 · 5 · 13
5 · 379
5 · 379
52 · 89
52 · 89
3
5 · 97
52 · 97
3
5 · 107
52 · 107
δ
163
5 · 163, δ ∈ {0, 1}
35 · 17
35 · 17
In each case, all corresponding elliptic curve data are available. Again, one checks
that, for each of these duodecic forms F and all j-invariants j0 of the elliptic
curves at the levels corresponding to F , the associated equations j0 = j(x, y) and
J(j0 ) = j(x, y) have no rational roots.
14.3. Small bounds for the exponents. If not only all elliptic curves but actually all newforms at the appropriate levels are known, one can compute explicit
bounds for the prime exponent l which, in practice, turn out to be much smaller
than that provided by our main theorem. Again, there is no need to invoke all
the machinery we have constructed – we can simply apply standard congruences
provided by the modular method. In this subsection, we highlight some examples
of forms from Section 14.2 for which we are able to compute all newforms at the
appropriate levels.
In order to apply the modular machinery, we require the mod l Galois representation in question to be irreducible. Thanks to Mazur [31], we know that this is
automatically the case if l > 163. In fact, assuming only l > 13, we obtain a like result, provided only that we exclude finitely many possibilities for the corresponding
j-values. As is well-known, these are given by j in the following set :
{−17 · 3733 /217 , −172 · 1013 /2, −215 · 33 , −7 · 113 , −7 · 1373 · 20833 ,
−218 · 33 · 53 , −215 · 33 · 53 · 113 , −218 · 33 · 53 · 233 · 293 }.
It is straightforward to check if the j-invariant of a given Frey-Hellegouarch curve
can be equal to one of these values. In each of the examples we consider here, this
is not the case.
In what follows, we tabulate various Klein forms, together with a set S of prime
exponents l > 13 for which we have not ruled out the possibilities of (2) having a
solution using standard congruences coming from the modular method (i.e. (37)
and (38) for small primes p). For l > 13, l 6∈ S, our conclusion is, in each case,
that (2) has no (nonzero) integer solutions.
52
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Form
(3, 2, −7, −3)3
(3, 4, 2, 6)3
(3, 4, 10, 6)3
(2, 51, 369, 73)4
(3, 2, 25, −40)6
(5, 4, −40, −140)6
(5, 9, 5, 40)6
(2, 3, 66, −1100)12
(2, 3, −264, 1760)12
(2, 3, −66, −1100)12
S
{23, 67}
{19}
∅
{19, 37}
∅
∅
∅
∅
∅
{17}
With work, we can likely eliminate possible solutions corresponding to (some)
primes l ≤ 13 or l ∈ S. Our aim here, however, is merely to demonstrate that
the bounds on l provided by our main theorem are, in specific applications, overly
pessimistic.
15. Ackowledgments
The authors would like to thank Ryotaro Okazaki, Peter Olver and the anonymous referee for numerous valuable suggestions.
Appendix A. Conductor calculations
A.1. Quartic Klein forms at p = 2. To find minimal (integral) models for our
Frey-Hellegouarch curves in case F is a Klein form of index 3 with ∆F odd, we
consider quadratic twists of (20) over Q((−1)δ/2 ) (where δ is as defined in (27)),
translated as follows (we write H for H(x, y) and G for G(x, y) for concision):
Ex,y : Y 2 + Y = X 3 +
(66)
3H
(−1)δ G − 16
X+
,
16
64
if H(x, y) is even,
(67)
Ex,y : Y 2 + XY + Y = X 3 − X 2 +
3H − 5
(−1)δ G − 3H − 17
X+
,
16
64
if H(x, y) ≡ 7 (mod 16), and
(68)
Ex,y : Y 2 + XY = X 3 − X 2 +
3H + 3
(−1)δ G − 3H − 1
X+
,
16
64
if H(x, y) ≡ 15 (mod 16).
Corresponding quantities associated to the models Ex,y in (66) – (68) are
∆(x, y)
=
33 δ3 F (x, y)3
c4 (x, y)
=
−32 H(x, y)
c6 (x, y)
=
j(x, y)
(−1)δ+1 33 G(x, y)/2
−33 H(x, y)3
=
.
δ3 F (x, y)3
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
53
It is by no means obvious that these curves have integral coefficients. To see
that they do, note that
H(x, y) ≡ (8α0 α2 − α12 )x4 + (8α0 α3 + 4α1 α2 )x3 y + (2α1 α3 + 4α22 )x2 y 2
+(8α1 α4 + 4α2 α3 )xy 3 + (8α2 α4 − α32 )y 4 (mod 16).
If α1 is even, then, from the assumption that δ3 is odd, necessarily α0 and α3 are
odd, whereby from (7), 4 | α1 and 2 | α2 . It follows that
H(x, y) ≡ 8x3 y + 2α1 α3 x2 y 2 + 4α2 α3 xy 3 − α32 y 4 (mod 16),
and hence either H(x, y) ≡ −1 (mod 8) or H(x, y) ≡ 0 (mod 16). Via symmetry,
we reach a like conclusion if α3 is even. On the other hand, if α1 and α3 are both
odd, then (7) implies that α1 α3 ≡ −1 (mod 4) and α2 ≡ 1 (mod 2), whereby, from
(28), precisely one of α0 , α4 , say α0 , is even. We thus have that either H(x, y) ≡
−1 (mod 8) or that H(x, y) is even whereby xy is necessarily odd, and
H(x, y) ≡ −(α1 − α3 )2 + 4 ≡ 0 (mod 16).
If H(x, y) is even, then necessarily, from Proposition 2.1, F (x, y) is odd, whereby,
from (12), G(x, y) ≡ 16 (mod 32). We claim that (−1)δ G(x, y) ≡ 16 (mod 64).
If α2 is odd, then (7) implies that α1 α3 ≡ −1 (mod 4) and so, from (28), α0
and α4 have opposite parity. Since F (x, y) is odd, it follows that xy is even,
contradicting the fact that
2
(69)
H(x, y) ≡ − α1 x2 + α3 y 2 ≡ 0 (mod 4).
We may thus assume that α2 is even and hence, from (7), (28) and (69), that either
4 | α1 , 2 | y and 2 - α0 , or 4 | α3 , 2 | x and 2 - α4 . In the first case,
G(x, y) ≡ 16 α3 (mod 64),
while, in the second,
G(x, y) ≡ −16 α1 (mod 64).
δ
In each case, (−1) G(x, y) ≡ 16 (mod 64).
Next, suppose that H(x, y) is odd (so that H(x, y) ≡ −1 (mod 8)). From (7), it
is straightforward to check that
(−1)δ G − 3H ≡ 1 or 17 (mod 64),
for H ≡ −1 or 7 (mod 16), respectively (where we note that the choice of δ ensures
that (−1)δ G ≡ −2 (mod 8)).
A.2. Conductor calculations at p k δn . In many cases where a prime p k δn , for
a given Klein form F , we are able to be precise about the reduction at p of our
Pk
Frey-Hellegouarch curves Ex0 ,y0 . Let, as usual, F (x, y) = i=0 αi xk−i y i denote a
Klein form of index n.
Lemma A.1. Let p be a prime with p - α0 and suppose p k δn . Furthermore, assume
that p 6= 2 (if n = 4), and that p > 5 (if n = 5). Then
F ≡ α0 (x − ay)k−1 (x − by)
(mod p)
for some a, b ∈ Z with a 6≡ b (mod p). In particular, for the Hessian of F , we have
H ≡ −α02 (a − b)2 (x − ay)2k−4
(mod p).
54
MICHAEL A. BENNETT AND SANDER R. DAHMEN
Proof. The formula for H follows immediately from that for F . Since p | δn (and
hence ∆F ), F necessarily has a repeated factor modulo p. If no such factor is
linear, this places severe restrictions upon the coefficients of F (modulo p). Using
the relations (7), (8) and (9), we see that this is in fact impossible, so F must have a
double root over Fp . After a linear translation, we can assume that αk ≡ αk−1 ≡ 0
(mod p). Again appealing to (7), (8) and (9), we find that also αk−2 ≡ . . . ≡ α2 ≡ 0
(mod p). It remains, then, to show that p - α1 . Assume to the contrary that p | α1 .
In case the index n = 2 or 3, the formulae for δn immediately imply that p2 | δn , a
contradiction. In case the index n = 4 or 5, (8) and (9) yield p2 | a6 (if (n, p) = (4, 5)
or (n, p) = (5, 11), this is less immediate and one must take care). Except when
(n, p) = (5, 11), we may thus conclude that p2 | δn , again a contradiction. In the
remaining case, one obtains 11 | a1 , 112 | a2 , and 113 | ai , for i = 3, 4, 5, 6, which is
enough to imply that 112 | δ5 . This proves the lemma.
Proposition A.2. Let p be a prime with p - α0 n and suppose p k δn . If the index
n = 2 or 3, then the Frey-Hellegouarch curve Ex0 ,y0 associated to a solution of
equation (2) has multiplicative reduction at p.
Proof. In the model for our Frey-Hellegouarch curve we have p | ∆(x0 , y0 ), so it
suffices to prove that p - c4 (x0 , y0 ). This would follow if p - H(x0 , y0 ). If we suppose
p | H(x0 , y0 ), then Lemma A.1 implies that p | x0 − ay0 , and so p | F (x0 , y0 ). Since
l > 1, it follows that p2 | F (x0 , y0 ). Using a linear translation, we can assume,
without loss of generality, that a ≡ 0 (mod p) and so F ≡ α0 xk−1 (x − by) (mod p),
i.e. α2 ≡ . . . ≡ αk ≡ 0 (mod p). The fact that p | x0 , together with p2 | F (x0 , y0 ),
therefore implies that p2 | αk . From the expressions for δn given in Section 2, we
may conclude, in case n = 2 or 3, that p2 | δn , a contradiction.
For index n = 4 and 5, we do not obtain quite such a strong result. Indeed it
can happen that the root x0 − ay0 modulo p lifts to a root in Zp . Furthermore,
there are actually cases where (with similar assumptions to those of the lemma and
proposition above) the Frey-Hellegouarch curve actually has additive reduction at
p. Even in such instances, however, we are, in a certain sense, not too far from
multiplicative reduction.
Proposition A.3. Let p be a prime with p - α0 n and suppose p k δn , for n = 4 or 5.
Furthermore, assume that p > 3 if n = 5. Then the Frey-Hellegouarch curve Ex0 ,y0
√
associated to a solution of (2) with l > 7, or its quadratic twist over Q( ±p), has
multiplicative reduction at p.
Proof. Since p 6= 2, it suffices to prove that Ex0 ,y0 has potentially multiplicative
reduction, i.e. νp (j(x0 , y0 )) < 0. We calculate
νp (j(x0 , y0 )) = 3 νp (H(x0 , y0 )) − n νp (F (x0 , y0 )) − 1.
From Lemma A.1, if p - F (x0 , y0 ), then p - H(x0 , y0 ), in which case νp (j(x0 , y0 )) =
−1 < 0. It remains only to consider when p | F (x0 , y0 ). If n = 4 (so that k =
6), then, from Proposition 2.1, νp (H(x0 , y0 )) ≤ k(k − 1)/3=10. Together with
νp (F (x0 , y0 )) ≥ l, we obtain νp (j(x0 , y0 )) < 0. A similar argument, in case n =
5, leads to a like conclusion, at least for l ≥ 29. To treat smaller values of l,
note that there certainly exist (nondegenerate) x, y ∈ Z with p - F (x, y), so that
E
νp (j(x, y)) = −1. It follows from the theory of Tate curves that 5 | #ρ5 x,y (Ip ),
E
E
where Ip ⊂ Gal(Q/Q) denotes an inertia subgroup of p. Since ρ5 x,y ' ρ5 x0 ,y0 ,
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
55
E
we also have 5 | #ρ5 x0 ,y0 (Ip ). On the other hand, if Ex0 ,y0 has potentially good
E
reduction at p, then no primes ≥ 5 can divide #ρ5 x0 ,y0 (Ip ). This contradiction
concludes the proof.
References
[1] M. A. Bean, An isoperimetric inequality for the area of plane regions defined by binary forms,
Compositio Math. 92 (1994), 115–131.
[2] M. Bennett, J. Ellenberg and N. Ng, The Diophantine equation A4 + 2δ B 2 = C n , Int. J.
Number Theory 6 (2010), 1–27.
[3] F. Beukers, The generalized Fermat equation, notes available at
www.math.uu.nl/people/beukers/Fermatlectures.pdf
[4] C. Breuil, B. Conrad, F. Diamond and R. Taylor, On the modularity of elliptic curves
over Q: wild 3-adic exercises, J. Amer. Math. Soc. 14 (2001), 843–939.
[5] H. Carayol, Formes modulaires et représentations galoisiennes à valeurs dans un anneau local complet, Contemp. Math. 165 (1994), 213–237.
[6] J. E. Cremona, Reduction of binary cubic and quartic forms, LMS J. Comput. Math.
2 (1999), 62–92.
[7] S. Dahmen, Classical and modular methods applied to Diophantine equations, PhD
thesis, University of Utrecht, 2008. Permanently available at
igitur-archive.library.uu.nl/dissertations/2008-0820-200949/UUindex.html
[8] H. Darmon and A. Granville, On the equations z m = F (x, y) and Axp + By q = Cz r ,
Bull. London Math. Soc. 27 (1995), 513–543.
[9] H. Darmon and L. Merel, Winding quotients and some variants of Fermat’s last
theorem, J. Reine Angew. Math. 490 (1997), 81–100.
[10] H. Davenport, On the class-number of binary cubic forms. I. J. London Math. Soc.
26 (1951), 183–192; ibid, 27 (1952), 512.
[11]
, On the class-number of binary cubic forms. II. J. London Math. Soc. 26
(1951), 192–198.
[12] L.V. Dieulefait and J. Jimenez, Solving Fermat-type equations x4 + dy 2 = z p via
modular Q-curves over polyquadratic fields, J. Reine Angew. Math. 633 (2009), 183–
196.
[13] J. Edwards, Platonic Solids and Solutions to X 2 + Y 3 = dZ r , PhD thesis, University
of Utrecht, 2005. Permanently available at
igitur-archive.library.uu.nl/dissertations/2006-0208-200155/UUindex.html
[14]
, A complete solution to X 2 + Y 3 + Z 5 = 0, J. Reine Angew. Math., 571
(2004), 213–236.
[15] J. Ellenberg, Galois representations attached to Q-curves and the generalized Fermat
equation A4 + B 2 = C p , Amer. J. Math., 126 (2004), 763–787.
[16] N. D. Elkies, ABC implies Mordell, Internat. Math. Res. Notices, 7 (1991), 99–109.
[17] P. Erdős, Arithmetical properties of polynomials, J. London Math. Soc. 28 (1953),
416–425.
[18] G. Faltings, Endlichkeitssätze für abelsche Varietäten über Zahlkörpern, Invent.
Math. 73 (1983), 349–366.
[19] T. Fisher, The Hessian of a genus one curve, Proc. London Math. Soc., to appear.
[20] P. Gordan, Beweis, dass jede Covariante und Invariante einer binären Form eine ganze
Function mit numerischen Coefficienten einer endlichen Anzahl solcher Formen ist, J.
Reine Angew. Math. 69 (1868), 323–354.
, Vorlesungen über Invariantentheorie, Teubner, Leipzig, 1887.
[21]
[22] A. Granville, Smooth numbers: computational number theory and beyond, Algorithmic Number Theory, MSRI Publications, Volume 44, 2008, 267–323.
[23] D. Hilbert, Theory of Algebraic Invariants, Cambridge Mathematical Library, 1993.
56
MICHAEL A. BENNETT AND SANDER R. DAHMEN
[24] A. Hoshi and K. Miyake, A geometric framework for the subfield problem of generic
polynomials via Tschirnhausen transformation, Number theory and applications, 65–
104, Hindustan Book Agency, New Delhi, 2009.
[25] F. Klein, Lectures on the icosahedron and the solution of equations of the fifth degree,
Dover publications, New York, 1956 (translation of original 1884 edition)
[26] A. Kraus, Majorations effectives pour l’équation de Fermat généralisée, Canad. J.
Math. 49 (1997), no. 6, 1139–1161.
[27] J. C. Lagarias, H. L. Montgomery and A. M. Odlyzko, A bound for the least ideal in
the Chebotarev density theorem, Invent. Math. 54 (1979), no. 3, 271–296.
[28] G. Lettl, A, Pethő and P. Voutier, Simple families of Thue inequalties, Trans. Amer.
Math. Soc. 351 (1999), 1871–1894.
[29] W. J. Leveque, On the equation y m = f (x), Acta Arith. 9 (1964), 209–219.
[30] G. Martin, Dimensions of the space of cusp forms and newforms on Γ0 (N ) and Γ1 (N ),
J. Number Theory 112 (2005), no. 2, 298–331.
[31] B. Mazur, Rational isogenies of prime degree, Invent. Math. 44 (1978), 129–162.
[32] L. Merel, Normalizers of split Cartan subgroups and supersingular elliptic curves,
Diophantine geometry, 237–255, CRM Series, 4, Ed. Norm., Pisa, 2007.
[33] F. Momose, Rational points on the modular curves Xsplit (p), Compositio Math. 52
(1984), 115–137.
[34] L. J. Mordell, On numbers represented by binary cubic forms, Proc. London Math.
Soc. 48 (1943), 198–228.
[35] P. Morton, Characterizing cyclic cubic extensions by automorphism polynomials, J.
Number Theory 49 (1994), 183–208.
[36] I. Papadopoulos, Sur la classification de Néron des courbes elliptiques en caractéristique résiduelle 2 and 3, J. Number Theory 44 (1993), 119–152.
[37] P. Olver, Classical Invariant Theory, Cambridge University Press, 1999.
[38] K. A. Ribet, On modular representations of Gal(Q/Q) arising from modular forms,
Invent. Math. 100 (1990), 431–476.
[39]
, Report on mod l representations of Gal(Q/Q), in Motives, Proc. Symp. Pure
Math. 55:2 (1994), 639–676.
[40] K. Rubin and A. Silverberg, Families of elliptic curves with constant mod p representations, Elliptic curves, modular forms, & Fermat’s last theorem (Hong Kong, 1993),
148–161, Ser. Number Theory, I, Int. Press, Cambridge, MA, 1995.
, Mod 2 representations of elliptic curves, Proc. Amer. Math. Soc. 129 (2001),
[41]
no. 1, 53–57.
[42] A. Schinzel and R. Tijdeman, On the equation y m = P (x), Acta Arith. 31 (1976),
199–204.
[43] J.-P. Serre, Propriété galoisiennes des points d’ordre fini des courbes elliptiques, Invent. Math. 15 (1972), 259–331.
[44]
, Quelques applications du théorème de densité de Chebotarev, Publications
mathématique de l’I.H.É.S., tome 54 (1981), 123–201.
[45] T. Shintani, On Dirichlet series whose coefficients are class numbers of integral binary
cubic forms. J. Math. Soc. Japan 24 (1972), 132–188.
[46]
, On zeta-functions associated with the vector space of quadratic forms. J.
fac. Sci. Univ. Tokyo, Sec. Ia 22 (1975), 25–66.
[47] X (C. L. Siegel), The integer solutions of the equation y 2 = axn + bxn−1 + . . . + k J.
London Math. Soc. 1 (1926), 66–68.
[48] C. L. Siegel, Einige Anwendungen diophantischer Approximationen, Abh. Preuss.
Akaad. Wiss. Phys. Math. Kl. (1929), 41–69.
[49] A. Silverberg, Explicit families of elliptic curves with prescribed mod N representations, Modular forms and Fermat’s last theorem (Boston, MA, 1995), 447–461,
Springer, New York, 1997.
KLEIN FORMS AND THE GENERALIZED SUPERELLIPTIC EQUATION
57
[50] H. Stender, Lösbare Gleichungen axn − by n = c und Grundeinheiten für einige algebraische Zahlkörper vom Grade n = 3, 4, 6, J. Reine Angew. Math. 290 (1977),
24–62.
[51] M. Stoll and J.E. Cremona, On the reduction theory of binary forms, J. Reine Angew.
Math. 565 (2003), 79–99.
[52] J. Thunder, On cubic Thue inequalities and a result of Mahler, Acta Arith. 83 (1998),
31–44.
[53] N. Tzanakis and B.M.M. de Weger, How to explicitly solve a Thue-Mahler equation,
Compositio Math. 84 (1992), 223-288.
[54] I. Wakabayashi, Cubic Thue inequalities with negative discriminant, J. Number Theory 97 (2002), 222–251.
[55]
, Simple families of Thue inequalities, Ann. Sci. Math. Québec 31 (2007), 211–
232.
[56]
, Number of solutions for cubic Thue equations with automorphisms, Ramanujan J. 14 (2007), 131–154.
[57] L.C. Washington, A family of cyclic quartic fields arising from modular curves, Math.
Comp. 57 (1991), 763–775.
[58] A. Wiles, Modular elliptic curves and Fermat’s Last Theorem, Ann. Math 141 (1995),
443–551.
Download