Diss. ETH No. 21502
Primes of the shape $x^2 + ny^2$
The Distribution on Average and Prime Number Races
A dissertation submitted to
ETH Zürich
for the degree of
Doctor of Sciences
Presented by
Jakob Johann Ditchen
Dipl.-Math. techn.
Universität Karlsruhe (TH)
Certificate of Advanced Study in Mathematics
University of Cambridge
Born on 10 November 1982
Citizen of Germany
Accepted on the recommendation of
Prof. Dr. Emmanuel Kowalski, Examiner
Prof. Dr. Özlem Imamoglu, Co-examiner
Prof. Dr. Philippe Michel, Co-examiner
2013
Abstract
This thesis focuses on uniformities and discrepancies in the distribution of prime
numbers represented by positive definite integral binary quadratic forms of various
discriminants.
We prove results of Bombieri–Vinogradov and Barban–Davenport–Halberstam type
on the average distribution of the primes with respect to their representability by
these forms. Our results imply that the corresponding prime number theorem holds
uniformly and with a non-trivial error term for almost all negative fundamental
discriminants in long ranges.
Moreover, we investigate a variant of “Chebyshev’s bias” between primes of the
shapes $x^2 + ny^2$ and $x^2 + my^2$ for certain distinct positive integers n and m.
German Summary
This dissertation is concerned with uniformities and discrepancies in the distribution of
prime numbers that are representable by positive definite integral binary quadratic forms
of various discriminants.
We prove variants of the Bombieri–Vinogradov theorem and of the Barban–Davenport–Halberstam
theorem and thereby show that, apart from at most “few” exceptions, the corresponding prime
number theorem has a uniform and non-trivial remainder term for negative fundamental
discriminants in long intervals.
Furthermore, for certain pairs (n, m) of positive integers, we investigate a discrepancy
between the distributions of primes of the form $x^2 + ny^2$ and those of the form
$x^2 + my^2$; this is a counterpart to a classical observation of Chebyshev concerning primes
of the forms 4k + 1 and 4k + 3, which has been studied intensively in recent years.
Binary quadratic forms, that is, homogeneous polynomials of the shape
\[ f(x, y) = ax^2 + bxy + cy^2 \qquad (x, y \in \mathbb{Z}) \]
with integral coefficients a, b and c, are, apart from arithmetic progressions, the simplest
polynomials of which it is known that each of them represents infinitely many primes, provided
that congruence relations among the coefficients do not obviously prevent this.
The analytic theory of the representation of primes by fixed binary quadratic forms has been
studied about as thoroughly as that of primes in fixed arithmetic progressions. In contrast,
little is known about how these properties behave on average over several binary quadratic
forms of different discriminants, or how they differ when two distinct forms are compared;
numerous results of this kind exist, on the other hand, for primes in arithmetic progressions.
For primes in arithmetic progressions, results on the first kind of problem have been obtained
by means of the Large Sieve since the 1960s; in applications they often make it possible to
dispense with the Generalized Riemann Hypothesis. The best known of these results are the
Bombieri–Vinogradov theorem and the Barban–Davenport–Halberstam theorem. On the one hand,
they show that the remainder term in the prime number theorem for arithmetic progressions is,
on average, as small as predicted by this hypothesis; here the average is taken over the moduli
of the arithmetic progressions in the same range in which the hypothesis yields non-trivial
results. On the other hand, it was shown that the mean square error in the prime number theorem
is very small when one averages over both the moduli and their residue classes; the admissible
range for the moduli here even exceeds the range controlled by the hypothesis.
In this thesis we find analogous results for positive definite integral binary quadratic
forms. Let X be a large positive number. For the number of primes p ≤ X for which, given an
integer n, there exist integers x and y such that $p = x^2 + ny^2$, we show in particular that
the corresponding prime number theorem holds uniformly in n for the squarefree positive
integers n ≡ 1 (mod 4) below about $X^{1/10}$, apart from at most “few” exceptions. More
generally, we prove that for all A > 0 there exists a constant B = B(A) such that for all
ε > 0 the bound
\[
  \sideset{}{'}\sum_{q \ge -Q} \; \max_{C \in K(q)}
  \left| \pi(X; q, C) - \frac{\operatorname{li}(X)}{e(C)\,h(q)} \right|
  \ll_{\varepsilon, A} Q^{1/2} X (\log X)^{-A}
\]
holds if $Q^{10+\varepsilon} \le X (\log X)^{-B}$. Here π(X; q, C) denotes the number of primes
p ≤ X which are representable by the quadratic forms of the form class C in the form class
group K(q) of discriminant q, h(q) denotes the class number of this discriminant, li stands for
the logarithmic integral and e(C) is a constant depending on the class; the sum on the left-hand
side runs over negative fundamental discriminants q ≥ −Q with q ≢ 0 (mod 8).
Furthermore, we show that the remainder term in the prime number theorem for positive definite
binary quadratic forms is small in a larger range in the quadratic mean over both the
fundamental discriminants and the corresponding form classes: for all A > 0 there exists a
constant B = B(A) such that for all ε > 0 the bound
\[
  \sideset{}{'}\sum_{q \ge -Q} \; \sum_{C \in K(q)}
  \left| \pi(X; q, C) - \frac{\operatorname{li}(X)}{e(C)\,h(q)} \right|^2
  \ll_{\varepsilon, A} Q^{1/2} X^2 (\log X)^{-A}
\]
holds if $Q^{3+\varepsilon} \le X (\log X)^{-B}$.
Neither result attains the strength of the above-mentioned results for arithmetic
progressions. Among other things, this is because we only manage to find a weaker version of a
large sieve inequality for complex class group characters, which seems indispensable for
results of this type.
While the results mentioned so far deal with uniformity in the distribution of primes
representable by arithmetic progressions or by binary quadratic forms, the question of
discrepancies in these distributions is no less interesting. Chebyshev already observed that
the number of primes in the progression 4k + 1 below a given number is usually smaller than the
number in the progression 4k + 3. By the prime number theorem both counts are asymptotically
equal, so the causes of this “preference” of the primes for the second progression are not
obvious. Only in recent years has this discrepancy been studied thoroughly in a general form
for arithmetic progressions.
In this thesis we investigate a similar effect, which appears for the number of primes of the
form $x^2 + ny^2$ and those of the form $x^2 + my^2$ below a given number when the discriminants
of the two forms agree in their class number (so that, by the prime number theorem, the
asymptotic behaviour of the distributions also coincides) but differ in the number of their
odd prime factors.
The great masters such as Fermat, Euler, Gauß and Dirichlet paid at least as much attention to
primes of the form $x^2 + ny^2$ as to primes of the form a + nk. And de la Vallée Poussin, in his
work on the prime number theorem, still proved it not only in its ordinary form and in the form
for arithmetic progressions, but at the same time in the form for positive definite binary
quadratic forms. Although primes that are representable by binary quadratic forms have since
appeared prominently time and again, for example as an important ingredient of certain
factorization algorithms, investigations of uniformities and discrepancies in their
distribution have received nowhere near the same attention as the corresponding research on
primes in arithmetic progressions, which is often simpler. This thesis aims to contribute to
closing this gap one day.
Dedicated to my parents
Contents
Preface
Notation
1 Primes represented by positive definite binary quadratic forms
1.1 The composition of binary quadratic forms and form classes
1.2 Algebraic methods for arithmetic objects
1.3 The Chebotarev density theorem and conditional results
2 The average distribution of primes represented by positive definite binary quadratic forms with varying discriminant
2.1 Mean-value results for primes in arithmetic progressions
2.2 A large sieve inequality for complex ideal class group characters
2.3 Results of Bombieri–Vinogradov type
2.4 The mean square distribution
2.5 Applications and open questions
3 Chebyshev’s bias and prime number races for binary quadratic forms
3.1 Bias in the distribution of primes in arithmetic progressions
3.2 Primes represented by different classes of forms with a fixed discriminant
3.3 Prime number races for forms of the shape $x^2 + ny^2$
Bibliography
Preface
Apart from arithmetic progressions, integral binary quadratic forms – that is, homogeneous
polynomials of the shape
\[ f(x, y) = ax^2 + bxy + cy^2 \qquad (x, y \in \mathbb{Z}) \]
with integral coefficients a, b and c – are the simplest polynomials which are known to represent
infinitely many prime numbers unless there is an obvious obstacle by means of a common prime
divisor of the coefficients. Analytic questions about prime numbers which are representable by
any fixed binary quadratic form have been studied almost as extensively as analytic questions
about prime numbers in arithmetic progressions. There is, however, not much known about
the average behaviour of these representation properties when averaged over binary quadratic
forms of distinct discriminants or about differences in these properties between two distinct
binary quadratic forms. In contrast, there exist plenty of comparable results for prime numbers
in arithmetic progressions.
Results for questions on “uniformity on average” in the prime number theorem for arithmetic
progressions of various moduli have been achieved by means of the Large Sieve since the 1960s.
In applications they often make it possible to dispense with the assumption of the Generalized Riemann
Hypothesis (GRH). The Bombieri–Vinogradov theorem and the Barban–Davenport–Halberstam
theorem are the most famous and important of these results: The first one shows that the error
term in the prime number theorem for arithmetic progressions is small – as small as predicted
by GRH – for all reduced residue classes, “on average” over moduli in about the same range
of moduli in which GRH yields non-trivial results. The second theorem shows that the mean
square of the error term is small if one averages over both moduli and their reduced residue
classes; here the admissible range for the moduli even exceeds the range that may be controlled
by GRH.
In this dissertation we find analogous results for positive definite integral binary quadratic
forms. For all large positive numbers X, we show that the prime number theorem for primes
p ≤ X of the shape $p = x^2 + ny^2$ holds with a uniform, small remainder term for almost all
squarefree positive integers n ≡ 1 (mod 4) up to about $X^{1/10}$, that is, with at most “few”
exceptions. In fact, we prove more generally that for all A > 0 and all ε > 0, there exists a
constant B = B(A) such that
\[
  \sideset{}{'}\sum_{q \ge -Q} \; \max_{C \in K(q)}
  \left| \pi(X; q, C) - \frac{\operatorname{li}(X)}{e(C)\,h(q)} \right|
  \ll_{\varepsilon, A} Q^{1/2} X (\log X)^{-A}
\]
if $Q^{10+\varepsilon} \le X (\log X)^{-B}$. Here π(X; q, C) denotes the number of primes p ≤ X which are
representable by quadratic forms lying in the form class C of the form class group K(q) of
discriminant q; the corresponding class number is h(q), the logarithmic integral is denoted by
li, and e(C) is a constant which depends on the form class only; the sum on the left-hand side
is over negative fundamental discriminants q ≥ −Q with q ≢ 0 (mod 8).
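The main term li(X)/(e(C)h(q)) can be illustrated numerically. The following Python sketch is not part of the thesis; it counts primes p ≤ X of the shape x^2 + ny^2 by brute force and compares the count with li(X)/(e(C)h(q)) for the principal class. The function names, the Simpson-rule approximation of li and the choice of examples are assumptions made for illustration: n = 1 (q = −4, h(q) = 1) and n = 5 (q = −20, h(q) = 2), where the principal class has e(C) = 2 in both cases.

```python
from math import isqrt, log


def primes_up_to(limit):
    """Sieve of Eratosthenes: return the list of all primes p <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [k for k in range(limit + 1) if sieve[k]]


def count_primes_of_shape(n, limit):
    """Number of primes p <= limit with p = x^2 + n*y^2 for some integers x, y."""
    values = set()
    y = 0
    while n * y * y <= limit:
        for x in range(isqrt(limit - n * y * y) + 1):
            values.add(x * x + n * y * y)
        y += 1
    return sum(1 for p in primes_up_to(limit) if p in values)


def li(x, steps=20000):
    """Approximate li(x) = integral of dt / log(t) over [2, x] (Simpson's rule)."""
    if x <= 2:
        return 0.0
    steps += steps % 2  # Simpson's rule needs an even number of subintervals
    h = (x - 2) / steps
    total = 1 / log(2) + 1 / log(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) / log(2 + k * h)
    return total * h / 3


if __name__ == "__main__":
    X = 100_000
    # (n, class number h(-4n), e(C) of the principal class): illustrative inputs
    for n, h, e in [(1, 1, 2), (5, 2, 2)]:
        print(n, count_primes_of_shape(n, X), round(li(X) / (e * h), 1))
```

For a single fixed discriminant this only illustrates the classical prime number theorem for forms; the averaged statements above concern many discriminants at once and are not tested by such a computation.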
Furthermore, we show that the mean square of the remainder term in the prime number
theorem for positive definite binary quadratic forms is small in a longer range – if we average
over both the discriminants and the corresponding form classes: For all A > 0 and all ε > 0,
there exists a constant B = B(A) such that the bound
\[
  \sideset{}{'}\sum_{q \ge -Q} \; \sum_{C \in K(q)}
  \left| \pi(X; q, C) - \frac{\operatorname{li}(X)}{e(C)\,h(q)} \right|^2
  \ll_{\varepsilon, A} Q^{1/2} X^2 (\log X)^{-A}
\]
holds if $Q^{3+\varepsilon} \le X (\log X)^{-B}$.
Neither statement reaches the strength of the aforementioned theorems for arithmetic
progressions. Among other reasons, this shortfall is due to the fact that we only succeed in
finding a weaker version of a large sieve inequality for complex ideal class group characters,
which seems to be essential for results of this type.
The results that we have mentioned so far have been concerned with the examination of
uniformities in the distributions of primes in arithmetic progressions and primes which may be
represented by binary quadratic forms. The question of discrepancies in these distributions is
scarcely less interesting. Chebyshev already noticed that the number of primes below a given
bound and lying in the progression 4k + 1 is usually smaller than the number of primes lying
in the progression 4k + 3. The reason for this “bias” is not obvious as, by the corresponding
prime number theorem, the cardinalities of both sets are asymptotically equal. Only recently
has this discrepancy been analysed in a more general setting.
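Chebyshev's observation is easy to reproduce for small bounds. The following Python sketch (an illustration only, with function names chosen here) counts the primes up to a bound in the two progressions 4k + 1 and 4k + 3.

```python
from math import isqrt


def primes_up_to(limit):
    """Sieve of Eratosthenes: return the list of all primes p <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [k for k in range(limit + 1) if sieve[k]]


def race_mod_4(limit):
    """Return (number of primes p <= limit with p = 1 mod 4, same for 3 mod 4)."""
    ones = threes = 0
    for p in primes_up_to(limit):
        if p % 4 == 1:
            ones += 1
        elif p % 4 == 3:
            threes += 1
    return ones, threes


if __name__ == "__main__":
    for bound in (100, 1_000, 10_000, 100_000):
        ones, threes = race_mod_4(bound)
        print(bound, ones, threes, threes - ones)
```

A table of this kind shows only who leads at a few chosen bounds; it says nothing about how often the lead changes in the long run.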
We examine a similar “bias” which shows itself between primes of the shape $x^2 + ny^2$ and
primes of the shape $x^2 + my^2$ when the discriminants of these forms have the same class number
but a different number of distinct odd prime divisors.
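A race of this kind can be set up experimentally as follows. The Python sketch below is an illustration, not the method of the thesis. The pair n = 14, m = 21 is an assumed example: the discriminants −56 and −84 are fundamental, both have class number 4, and they have one and two odd prime divisors respectively. The code only reports the two counts and makes no claim about which shape leads.

```python
from math import isqrt


def primes_up_to(limit):
    """Sieve of Eratosthenes: return the list of all primes p <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [k for k in range(limit + 1) if sieve[k]]


def count_primes_of_shape(n, limit):
    """Number of primes p <= limit with p = x^2 + n*y^2 for some integers x, y."""
    values = set()
    y = 0
    while n * y * y <= limit:
        for x in range(isqrt(limit - n * y * y) + 1):
            values.add(x * x + n * y * y)
        y += 1
    return sum(1 for p in primes_up_to(limit) if p in values)


def race(n, m, bounds):
    """For each bound X, return (X, count for x^2 + n*y^2, count for x^2 + m*y^2)."""
    return [(X, count_primes_of_shape(n, X), count_primes_of_shape(m, X))
            for X in bounds]


if __name__ == "__main__":
    # n = 14 and m = 21 are illustrative choices, not taken from the thesis.
    for row in race(14, 21, (1_000, 10_000, 100_000)):
        print(row)
```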
The old masters like Fermat, Euler, Gauß and Dirichlet paid at least as much regard to
primes of the form $x^2 + ny^2$ as to primes of the form a + nk. Likewise de la Vallée Poussin,
in his seminal work on the prime number theorem, proved it not only in the ordinary form
for all prime numbers but also for arithmetic progressions and for positive definite binary
quadratic forms. Since then, prime numbers that can be represented by specific binary
quadratic forms have seen many important applications, e.g. as ingredients of certain
factorization algorithms. However, research on uniformities and discrepancies in
the distribution of such prime numbers has received much less attention than the corresponding
research on prime numbers in arithmetic progressions, which often proves to be easier. This
thesis aims to provide a modest step towards closing this gap.
Outline
Chapter 1 serves as an introduction to the fundamental results of the theory of binary quadratic
forms and the representability of primes by such forms of a fixed discriminant. We start the
chapter with some historical remarks and give a review of the primary notions of this theory.
The basis of the theory of binary quadratic forms was mainly established by Gauß in his
seminal Disquisitiones Arithmeticae. In particular, he fleshed out the definition of equivalence
classes of binary quadratic forms and the theory of composition of these forms (which both have
their origins in the work of Lagrange). In Section 1.1, we give a brief review of these concepts,
which led to the first results by Dirichlet, Weber and de la Vallée Poussin on the number of
primes that are representable by any given binary quadratic form.
In the subsequent Section 1.2, we summarize the relation between binary quadratic forms and
ideals in quadratic fields. This relationship has been essential for most analytic investigations
on primes and binary quadratic forms. In particular, Landau’s improvement of de la Vallée
Poussin’s prime number theorem for binary quadratic forms, which we also state in this section,
is built on this connection and our own results in Chapters 2 and 3 will also rely on it.
We pick up this topic in Section 1.3, in which we state the Chebotarev density theorem
and relate it to the prime number theorem for binary quadratic forms. The emphasis is laid on
conditional results that depend on appropriate versions of the Generalized Riemann Hypothesis;
they will be used in Chapter 3.
In Chapter 2, the main part of this thesis, we investigate the uniformity of the distribution of
primes “in” form classes (i.e., with respect to the representability by these classes) when the
discriminant of the classes varies over negative fundamental discriminants q ≢ 0 (mod 8) and
we demonstrate that good error terms in the corresponding prime number theorem hold “on
average” (however, we do not achieve the above-mentioned conditional error terms).
We start with a review of analogous results for primes in arithmetic progressions in
Section 2.1.
The first original results are proved in Section 2.2, in which we find a large sieve inequality
for complex ideal class group characters. This inequality lies at the heart of the subsequent
two sections. Our restriction to fundamental discriminants q with q ≢ 0 (mod 8) in this chapter
is mainly due to the cumbersome proof that this large sieve inequality would require for more
general discriminants.
In Section 2.3, we prove results of Bombieri–Vinogradov type for the counting function for
primes represented by binary quadratic forms as well as similar results for smooth versions of
appropriate Chebyshev functions and for special subsets of negative fundamental discriminants.
The results for the latter functions and sets show an interesting feature in that we end up with
more than the usual saving of just a power of a logarithm over “trivial” bounds. We also notice
that the results may be improved if we assume the Lindelöf Hypothesis for Rankin–Selberg
convolutions of holomorphic cusp forms of weight one.
The second type of uniformity results are of Barban–Davenport–Halberstam type; they
are the topic of Section 2.4. We show that general arithmetic functions exhibit an “average
behaviour” with respect to the representability of integers by form classes – for most form
classes to most discriminants in long ranges – if the functions satisfy Siegel–Walfisz conditions
for both arithmetic progressions and form classes (and an additional technical condition).
Two applications and an outlook on possible extensions and generalizations are provided in
Section 2.5. The first application deals with the question about the size of the least primes that
are representable by binary quadratic forms of a given discriminant. The second application is
a uniformity result for integers of the form $k = x^2 + ny^2$ which are the product of two primes
that are representable by forms of discriminant −4n.
In spite of these uniformity results, frequency discrepancies still occur between primes of the
shapes $x^2 + ny^2$ and $x^2 + my^2$ for distinct positive integers n and m – even when their frequencies
show the same asymptotic behaviour. These discrepancies are the subject of the original results
of Chapter 3, which can be considered as a counterpoint to the results of Chapter 2.
Before we come to the new results, the so-called Chebyshev bias (or prime number race) for
primes in arithmetic progressions is reviewed in Section 3.1. Many important results on this
topic have been found only recently.
Previous research has also investigated an analogous bias in the distribution of prime ideals
in distinct ideal classes of a fixed imaginary quadratic field; we look at the corresponding results
in Section 3.2. Due to the close relation between ideal classes and form classes which we have
mentioned above, some of these results may be interpreted as prime number races for primes
represented by binary quadratic forms in different form classes of the same discriminant.
Finally, in Section 3.3, we demonstrate that there exists a bias in the distribution of primes
of the shapes $x^2 + ny^2$ and $x^2 + my^2$ when −4n and −4m are negative fundamental discriminants
with a different number of odd prime divisors but the same class number. Similarly to most
other recent results in comparative prime number theory, our results are conditional on the
Generalized Riemann Hypothesis and a linear independence hypothesis for the zeros of certain
class group L-functions and Dirichlet L-functions. In the proofs of the results of this section
– unlike the results of Chapter 2 – almost no major new difficulties arise that would require a
significant deviation from the proofs of results on Chebyshev’s bias for arithmetic progressions;
thus, after the initial setting of the scene, we mostly only stress the differences that occur and
do not repeat the proofs in full detail. We close the section with a list of questions and possible
extensions that could be investigated in future work.
About this Thesis
The research described in this dissertation was performed in the Department of Mathematics
(D-MATH) at ETH Zürich between December 2008 and October 2013, and was supervised
by Professor Dr. Emmanuel Kowalski.
Chapter 1 as well as Sections 2.1, 3.1 and 3.2 are of an expository nature and do not contain any
original results. The work presented in the remaining sections of this dissertation is original. It
is influenced primarily by the following earlier works:
• The proof of the large sieve inequality in Section 2.2 is based on a similar result of Duke
and Kowalski [DK00].
• The proof of the results in Section 2.3 follows roughly Gallagher’s method of proof for the
original Bombieri–Vinogradov theorem as presented in [Bom87].
• The proof of the results in Section 2.4 follows roughly the method of proof for the original
Barban–Davenport–Halberstam theorem and its generalizations as described in [IK04],
for example.
• The proofs of the results in Section 3.3 largely parallel some of the proofs of results on
Chebyshev’s bias in [RS94] and [FM13].
Jakob J. Ditchen
Zürich, Autumn 2013
Acknowledgements
I am greatly indebted to Professor Emmanuel Kowalski for suggesting the problems
from which this dissertation arose and for patiently guiding me with valuable advice.
I would also like to thank Professor Özlem Imamoglu and Professor Philippe Michel
for agreeing to examine this thesis.
Many colleagues made the time that I spent at the Department of Mathematics at
ETH Zürich a very pleasant and inspiring one, for which I thank them.
Finally, I am also deeply grateful to my beloved Andrea, who always stands by my side in her
wonderful and supportive way, to Martha and Izabela, who are great sisters to me, and to my
dear parents, who showed me the way to mathematics and to the art of learning and who have
supported me in a unique way throughout all these years.
Notation
We list here the main notation, symbols and assumptions that will be used throughout this
thesis; many of them are standard in (analytic) number theory. The letters m and n denote
positive integers, q denotes a negative discriminant, $q'$ and $q''$ denote fundamental discriminants,
s denotes a complex number, Q and X denote positive real numbers and C denotes a form class
when used as parameters or arguments of functions in the definitions below.
| Symbol | Meaning |
|---|---|
| $\varphi(n)$ | the Euler totient function |
| $\mu(n)$ | the Möbius function |
| $\pi(X)$ | the number of primes $p \le X$ |
| $\operatorname{li}(X)$ | the logarithmic integral, i.e. $\operatorname{li}(X) = \int_2^X \frac{dt}{\log t}$ |
| $\int_{(c)} \cdot \, ds$ | the integral on the line $\operatorname{Re}(s) = c$ in the complex plane |
| $\Gamma(s)$ | the gamma function |
| $\Lambda(n)$ | the von Mangoldt function |
| $\psi(X)$ | the Chebyshev function, i.e. $\psi(X) = \sum_{n \le X} \Lambda(n)$ |
| $\tau(n)$ | the number of positive integer divisors of n |
| $\omega(n)$ | the number of prime divisors of n counted without multiplicity |
| $(m, n)$ | the greatest common divisor of m and n |
| $[m, n]$ | the least common multiple of m and n |
| $D$ | the set of all negative discriminants, i.e. the set of all negative integers q satisfying q ≡ 0 (mod 4) or q ≡ 1 (mod 4) |
| $F$ | the set of all negative fundamental discriminants; see (1.3) |
| $F$ | the set of all negative fundamental discriminants q ≢ 0 (mod 8) |
| $F(Q)$ | the set of negative fundamental discriminants $q \in F$ with $\lvert q \rvert \le Q$ |
| $F_{\mathrm{ex}}(Q)$ | the subset of exceptional discriminants in $F(Q)$; see Section 2.3.2 |
| $K(q)$ | the form class group of binary quadratic forms with discriminant q |
| $C_0(q)$, $C_0$ | the form class which contains the principal form with discriminant q, i.e. the identity element of $K(q)$ |
| $R(q, C)$ | the set of all positive integers which can be represented by all forms f of the form class $C \in K(q)$ |
| $O_q$ | the ring of integers of $\mathbb{Q}(\sqrt{q})$ |
| $O(q)$ | the order of discriminant q in an imaginary quadratic field K; it equals $O_q$ if $q \in F$ (which we will usually assume; see below) |
| $I(q)$ | the group of invertible fractional $O(q)$-ideals, i.e. the group of invertible finitely generated $O(q)$-submodules of K |
| $P(q)$ | the subgroup of principal fractional $O(q)$-ideals |
| $H(q)$ | the quotient $I(q)/P(q)$, i.e. the ideal class group of the order $O(q)$ |
| $B_q$ | the bijection $K(q) \to H(q)$ given in Lemma 1.4 |
| $h(q)$ | the class number for the discriminant q, i.e. $h(q) = \lvert K(q) \rvert = \lvert H(q) \rvert$ |
| $Z(q)$ | the set of non-zero integral $O(q)$-ideals |
| $N(a)$ | the norm of the ideal $a \in Z(q)$, i.e. the size of the quotient ring $O(q)/a$ (the dependence on q is suppressed) |
| $\widehat{H}(q)$ | the dual group of $H(q)$, i.e. the set of (ideal) class group characters |
| $\chi_0^{(q)}$ | the trivial character in $\widehat{H}(q)$ |
| $\lambda_\chi(n)$ | the sum $\sum_{a \in Z(q),\, N(a) = n} \chi(a)$ for $\chi \in \widehat{H}(q)$; here and throughout we set $\chi(a) = \chi(C)$, where $C \in H(q)$ is the ideal class of $a \in I(q)$ |
| $\chi_{q'}$ | the Kronecker symbol $\left(\frac{q'}{\cdot}\right)$ for the fundamental discriminant $q'$, i.e. the primitive real Dirichlet character modulo $\lvert q' \rvert$ if $q' \not\equiv 0 \pmod 8$ |
| $\chi_{q', q''}$ | the real (ideal) class group character arising from the convolution of the Dirichlet characters $\chi_{q'}$ and $\chi_{q''}$; see (2.69) and (2.98) |
| $L(s, \lambda_\chi)$ | the L-function for the class group character χ |
| $L(s, \chi)$ | the L-function for the Dirichlet character χ (it may also denote a class group L-function in the displayed sums in Section 3.3) |
| $\pi(X; m, n)$ | the number of primes $p \le X$ with p ≡ n (mod m) |
| $\pi(X; q, C)$ | the number of primes $p \le X$ such that for every form f in $C \in K(q)$ there exist $x, y \in \mathbb{Z}$ satisfying $f(x, y) = p$ |
| $\psi(X; q, C)$ | the corresponding Chebyshev function |
| $\psi_k(X; q, C)$ | the smoothed and weighted Chebyshev function which is defined in equation (2.22) |
| $e(C)$ | the constant which equals 1 if the form class C has order $\ge 3$ in $K(q)$ and equals 2 otherwise |
| $\kappa(q)$ | the number of form classes C in $K(q)$ with $e(C) = 2$, i.e. the number of ambiguous classes; see equation (1.2) |
| $w(C, n)$ | the number of ideals $a \in B_q(C)$ with $N(a) = n$; see Remark 1.8 |
| $\nu$ | a divisor frequency associated to a subset of $F$; see Definition 2.12 |
| $\pi_0(X; n)$ | the number of primes $p \le X$ such that there exist $x, y \in \mathbb{Z}$ satisfying $p = x^2 + ny^2$ |
Complex numbers are generally denoted by s = σ + it with σ, t ∈ R.
Non-trivial zeros of L-functions are generally denoted by ρ = β + iγ with β, γ ∈ R.
When used as a variable, the letter p will always denote a rational prime and the Fraktur letter $\mathfrak{p}$
will always denote a prime ideal (of a ring which will be clear from the context).
Asymptotic notation:
For arithmetic functions f and g, we write $f(x) = O(g(x))$, or equivalently $f(x) \ll g(x)$, when
there is an absolute constant c such that $|f(x)| \le c\,g(x)$ for all values of x under consideration.
We usually write $f(x) = O_\alpha(g(x))$ or $f(x) \ll_\alpha g(x)$ if the constant depends on some parameter α; we may suppress such dependencies if they are sufficiently clear from the context.
We write $f(x) \sim g(x)$ if $\lim_{x \to \infty} f(x)/g(x) = 1$, and $f(x) = o(g(x))$ when $\lim_{x \to \infty} f(x)/g(x) = 0$.
General assumptions on binary quadratic forms and discriminants:
All binary quadratic forms in this thesis are assumed to be integral, primitive and positive
definite; in particular, all discriminants of forms, orders and fields are negative integers q which
satisfy either q ≡ 0 (mod 4) or q ≡ 1 (mod 4).
In addition, all discriminants which will appear after Remark 1.7 – in particular, all
discriminants in Chapter 2 and Chapter 3 – are assumed to be fundamental discriminants,
i.e. each of these discriminants q is assumed to satisfy either
(a) q ≡ 1 (mod 4) with q squarefree, or
(b) q ≡ 0 (mod 4) with q/4 ≡ 2 or 3 (mod 4) and q/4 squarefree.
Moreover, none of the discriminants in Chapter 2 will be a multiple of 8. In Chapter 3, we will
only consider forms of the shape $x^2 + ny^2$ for positive integers n; note that the discriminant of
$x^2 + ny^2$ is a negative fundamental discriminant if and only if n ≢ 3 (mod 4) and n is squarefree.
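Conditions (a) and (b) translate directly into a small test. The following Python sketch (function names are chosen here for illustration) checks whether a negative integer is a fundamental discriminant and confirms, for small n, the stated criterion for the discriminant −4n of x^2 + ny^2.

```python
def is_squarefree(n):
    """True if no square of a prime divides n (the sign of n is ignored)."""
    n = abs(n)
    if n == 0:
        return False
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def is_negative_fundamental_discriminant(q):
    """Check conditions (a) and (b) for a negative integer q."""
    if q >= 0:
        return False
    # Python's % returns a non-negative remainder, also for negative q.
    if q % 4 == 1:
        return is_squarefree(q)                            # condition (a)
    if q % 4 == 0:
        quarter = q // 4
        return quarter % 4 in (2, 3) and is_squarefree(quarter)  # condition (b)
    return False


def criterion_for_x2_plus_ny2(n):
    """The criterion from the text for the discriminant -4n of x^2 + n*y^2."""
    return n % 4 != 3 and is_squarefree(n)


if __name__ == "__main__":
    print([q for q in range(-1, -41, -1) if is_negative_fundamental_discriminant(q)])
```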
Chapter 1
Primes represented by positive
definite binary quadratic forms
The history of questions about prime numbers of the shape $x^2 + ny^2$ probably starts with
Fermat, who stated the following three assertions in letters to Mersenne in 1640 and to Pascal
in 1654: For all odd primes p,
\[
\begin{aligned}
  \exists\, x, y \in \mathbb{Z} : p = x^2 + y^2 \quad &\text{if and only if } p \equiv 1 \pmod 4, \\
  \exists\, x, y \in \mathbb{Z} : p = x^2 + 2y^2 \quad &\text{if and only if } p \equiv 1 \text{ or } 3 \pmod 8, \\
  \exists\, x, y \in \mathbb{Z} : p = x^2 + 3y^2 \quad &\text{if and only if } p = 3 \text{ or } p \equiv 1 \pmod 3.
\end{aligned}
\tag{1.1}
\]
Fermat claimed to have proofs for all these statements, but there is no evidence that this was
indeed the case. It took about one hundred years before Euler actually provided complete proofs
for Fermat’s assertions (see [Cox97, §1.1] and the references there).
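The three assertions in (1.1) can be checked by brute force for small primes. The following Python sketch does so; it is a numerical sanity check with function names chosen here, and it is of course no substitute for Euler's proofs.

```python
from math import isqrt


def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def is_represented(p, n):
    """True if p = x^2 + n*y^2 for some integers x, y."""
    y = 0
    while n * y * y <= p:
        rest = p - n * y * y
        x = isqrt(rest)
        if x * x == rest:
            return True
        y += 1
    return False


def fermat_counterexamples(limit):
    """Odd primes p <= limit violating one of the three assertions in (1.1)."""
    bad = []
    for p in range(3, limit + 1, 2):
        if not is_prime(p):
            continue
        ok = (
            is_represented(p, 1) == (p % 4 == 1)
            and is_represented(p, 2) == (p % 8 in (1, 3))
            and is_represented(p, 3) == (p == 3 or p % 3 == 1)
        )
        if not ok:
            bad.append(p)
    return bad


if __name__ == "__main__":
    print(fermat_counterexamples(5000))  # an empty list means no violation was found
```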
Similar statements for primes of the shape $x^2 + ny^2$ for particular values of n > 3 were
conjectured by Euler and most of them were later proved by Lagrange and Gauß by means of
quadratic reciprocity. However, it slowly became clear that congruence relations like these could
not exist for all positive integers n, let alone for arbitrary binary quadratic forms. Only the
evolution of the theory of the composition of forms and form classes as well as the development
of ideal theory and the ensuing relation between classes of quadratic forms and ideal classes in
quadratic fields opened up the possibility to find conditions for the representability of primes
by general binary quadratic forms and statements for the frequency of such primes.
Before going into details in the upcoming sections, let us recall the basic definitions and fix
certain assumptions concerning binary quadratic forms: An integral binary quadratic form is a
homogeneous polynomial of the shape
\[ f(x, y) = ax^2 + bxy + cy^2 \]
in two variables over the ring of rational integers. Such a form is called primitive if the
coefficients a, b and c share no common prime divisor; all forms in this thesis will be assumed
to be integral and primitive. We say that the binary quadratic form f represents an integer
m if there exist integers x and y such that $f(x, y) = m$. The discriminant of such a binary
quadratic form is defined to be $q = b^2 - 4ac$; note that this implies q ≡ 0 or 1 (mod 4). If q = 0,
then the form can only represent squares of integers. If q > 0, then the form can represent
both positive and negative integers; such a form is called indefinite. If q < 0, then the form can
represent either only negative integers or only positive integers (depending on the sign of the
coefficient a); the form is accordingly called negative definite or positive definite.
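These definitions are easy to make concrete. The following Python sketch (function names and return labels are chosen here for illustration) computes the discriminant of a form given by its coefficients (a, b, c), tests primitivity and classifies the form by the sign of the discriminant and of a.

```python
from math import gcd


def discriminant(a, b, c):
    """Discriminant q = b^2 - 4ac of the form a*x^2 + b*x*y + c*y^2."""
    return b * b - 4 * a * c


def is_primitive(a, b, c):
    """True if the coefficients a, b, c share no common prime divisor."""
    return gcd(gcd(a, b), c) == 1


def classify(a, b, c):
    """Classify the form according to the sign of its discriminant."""
    q = discriminant(a, b, c)
    if q > 0:
        return "indefinite"
    if q == 0:
        return "discriminant zero"
    # For q < 0 we have 4ac > b^2 >= 0, so a is non-zero and its sign decides.
    return "positive definite" if a > 0 else "negative definite"


if __name__ == "__main__":
    for form in [(1, 0, 5), (2, 2, 3), (1, 0, -2), (-1, 0, -1), (1, 2, 1)]:
        print(form, discriminant(*form), is_primitive(*form), classify(*form))
```

For example, x^2 + 5y^2 and 2x^2 + 2xy + 3y^2 both have discriminant −20 and are positive definite.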
In this thesis we shall only deal with binary quadratic forms of negative discriminant (in fact,
only with positive definite forms, since we are interested in the representability of primes) as the
theory of these forms is considerably simpler than the theory of forms of positive discriminants.
The difference is basically due to the fact that, if the discriminant is negative, then the equation
$k = ax^2 + bxy + cy^2$ (for fixed integers k, a, b and c) is the equation of an ellipse, which contains
only finitely many lattice points; it is, however, the equation of a hyperbola, which contains an
infinite number of lattice points, if the discriminant is positive. We quote Gauß:
Formae vero determinantium positivorum, quae tractationem prorsus peculiarem
requirunt, commentationi alteri reservatae manere debebunt¹
and we also hope that the questions which are examined in this thesis for forms of negative discriminant will be investigated for forms of positive discriminant in another work. Consequently,
we will also state all classical results in this chapter for positive definite binary quadratic forms
only – regardless of the possible existence of analogous results for indefinite forms. In order to
avoid excessive repetitions of these assumptions, we adopt the following convention:
Convention: Whenever we say form or quadratic form or binary quadratic form in this
thesis, we mean a positive definite integral primitive binary quadratic form.
This introductory chapter intends to give a brief account of the basic, mostly classical results
in the prime number theory for binary quadratic forms of a fixed discriminant. It is organized
as follows: In the next section we start with a review of the theory of composition of form
classes of binary quadratic forms, which we basically owe to Dirichlet, but which has its origins
already in the works of Lagrange and Gauß who composed forms instead of form classes. It
was exactly this transition that was essential to answer a broad variety of questions on the
representability of integers by such forms and made it possible to prove general qualitative as
well as the first quantitative statements on the infinitude of primes that are representable by
any given primitive binary quadratic form. The convenience to work with classes became even
clearer after the introduction of the theory of “ideal numbers” by Kummer and its development
by Dedekind. The relation between form classes and ideal classes is exhibited in Section 1.2
and the resulting more precise quantitative prime number theorem is given. Advances in class
field theory (in the shape of the Chebotarev density theorem) and in analytic number theory
led to even better results, conditional and unconditional ones, which are covered in Section 1.3.
There exist plenty of excellent books and papers that present many of these topics in a
much more detailed and more elaborated way than it would be possible and appropriate to
provide here. The author profited particularly from the books Primes of the form $x^2 + ny^2$
[Cox97], Zetafunktionen und quadratische Körper [Zag81] and The shaping of arithmetic after
C. F. Gauss’s Disquisitiones arithmeticae [GSS07] while writing this introductory chapter.
We would like to end this opening section by reminding the reader that binary quadratic
forms have found important applications in cryptography, which we cannot discuss here.
Extensive accounts of the underlying algorithms can be found in [BV07], [Bue89] and [Coh93],
for example.
¹ “The forms with positive determinant [discriminant], which require a special treatment, must remain reserved
for other studies.”; quoted from the introduction to Gauß’s article De nexu inter multitudinem classium, in
quas formae binariae secundi gradus distribuuntur, earumque determinantem, which can be found in the second
volume of his collective works as well as in the cumulative German translation Untersuchungen über höhere
Arithmetik of his number-theoretic works. Similarly, de la Vallée Poussin stated that “tandis que l’extension
se fait naturellement à ces dernières [i.e., forms with negative discriminant], les formes de déterminant positif
exigent une analyse beaucoup plus compliquée” in the third part [dlVP96] of his Recherches analytiques sur la
théorie des nombres premiers in which he proved the prime number theorem for positive definite binary quadratic
forms; he provided the analysis for indefinite forms one year later, in the fourth part [dlVP97] of his work.
1.1 The composition of binary quadratic forms and form classes
Gauß’s Disquisitiones Arithmeticae, published in 1801, are widely regarded as the beginning of
modern number theory. And about half of this work is devoted to the theory of binary quadratic
forms. Building on former definitions and results from Lagrange’s Recherches d’Arithmétique,
Gauß revealed here the importance and depth of the notion of equivalence of binary quadratic
forms and of the way in which these forms can be composed. Edwards [GSS07, §II.2.1] notes
that one of the purposes for Gauß to present the theory of composition in its full generality
was to give another proof of quadratic reciprocity – a simpler proof than the one he gave in an
earlier part of the Disquisitiones.
However, in order to be able to derive the laws and properties of the composition for arbitrarily given forms, long and complicated computations are necessary and the resulting composition is not even a binary operation. Later, Dirichlet “simplified” Gauß’s composition by
forfeiting the capability of composing arbitrary forms and contenting himself with the ability to
compose certain forms which are equivalent to the given ones. Thus, his composition of forms is
really a composition of equivalence classes of forms. Yet, this kind of composition was sufficient
for his questions on the representability of numbers by binary quadratic forms – and therefore
it is also sufficient for the questions we will be concerned with in this thesis.
We now describe this method of composing equivalence classes of binary quadratic forms.
First of all, we must say what we mean by equivalence of quadratic forms: Two binary quadratic
forms f and g are (properly) equivalent if there exists an element
\[ \begin{pmatrix} r & s \\ t & u \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z}) \]
such that
\[ f(x, y) = g(rx + sy, tx + uy) \]
for all x, y ∈ Z. A short calculation shows that equivalent forms have the same discriminant.
Moreover, it can be shown that equivalence of binary quadratic forms is indeed an equivalence
relation and the number of equivalence classes – which we call form classes – is finite for any
given discriminant of forms; see [Cox97, §2], for example. We denote the set of form classes of
forms with discriminant q by K(q) and its cardinality by h(q).
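For small discriminants the set K(q) can be listed explicitly. The sketch below is an illustration only and relies on an assumption that is not stated in the text above: the classical reduction theory (see [Cox97, §2]), according to which every form class contains exactly one reduced form, i.e. a form with |b| ≤ a ≤ c and with b ≥ 0 whenever |b| = a or a = c. The function names are hypothetical.

```python
from math import gcd

def reduced_forms(q):
    """Reduced primitive positive definite forms (a, b, c) of discriminant q < 0.

    Assumes the classical notion of reducedness: |b| <= a <= c, and b >= 0
    if |b| = a or a = c. Each form class then contains exactly one such form.
    """
    if q >= 0 or q % 4 not in (0, 1):
        raise ValueError("q must be negative with q = 0 or 1 (mod 4)")
    forms = []
    a = 1
    while 3 * a * a <= -q:                    # reduced forms satisfy 3a^2 <= |q|
        for b in range(-a + 1, a + 1):        # -a < b <= a
            if (b * b - q) % (4 * a) != 0:
                continue
            c = (b * b - q) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, b), c) == 1:        # keep primitive forms only
                forms.append((a, b, c))
        a += 1
    return forms

def class_number(q):
    """h(q), computed as the number of reduced forms of discriminant q."""
    return len(reduced_forms(q))
```

For q = −56 this yields the four forms (1, 0, 14), (2, 0, 7), (3, −2, 5) and (3, 2, 5), so that h(−56) = 4.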
The main importance of this classification lies in the following fact (see [Zag81, §8], for
example): Equivalent forms represent the same numbers. Therefore, we may define the set
\[ R(q, C) = \{ n \in \mathbb{Z} \mid \forall f \in C \ \exists x, y \in \mathbb{Z} : f(x, y) = n \} \]
for every negative integer q ≡ 0, 1 (mod 4) and every form class C ∈ K(q).
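The notion of proper equivalence can be made concrete by expanding g(rx + sy, tx + uy) coefficient by coefficient. The following Python sketch (with hypothetical helper names `transform` and `evaluate`) computes the coefficients of the transformed form; it is meant as an illustration, not as a statement about how the cited references proceed.

```python
def evaluate(form, x, y):
    """Value f(x, y) of the form f = (a, b, c)."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

def transform(form, matrix):
    """Coefficients of f(x, y) = g(rx + sy, tx + uy), where g = form.

    `matrix` is ((r, s), (t, u)) and must have determinant 1.
    """
    a, b, c = form
    (r, s), (t, u) = matrix
    if r * u - s * t != 1:
        raise ValueError("the matrix must lie in SL(2, Z)")
    return (
        a * r * r + b * r * t + c * t * t,
        2 * a * r * s + b * (r * u + s * t) + 2 * c * t * u,
        a * s * s + b * s * u + c * u * u,
    )
```

As an example, applying the matrix ((1, 1), (0, 1)) to the form (3, 2, 5) of discriminant −56 gives (3, 8, 10), which again has discriminant 64 − 120 = −56.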
We proceed to the definition of the composition of equivalence classes: Let
\[ f(x, y) = ax^2 + bxy + cy^2 \]
and
\[ g(x, y) = a'x^2 + b'xy + c'y^2 \]
be two forms of negative discriminant q and assume that the coefficients a, a' and (b + b')/2 have no
common prime divisor. Then the (Dirichlet) composition F of f and g is the form
\[ F(x, y) = aa'x^2 + Bxy + \frac{B^2 - q}{4aa'}\, y^2, \]
where B is the unique integer modulo 2aa' such that
\[ B \equiv b \pmod{2a}, \quad B \equiv b' \pmod{2a'} \quad \text{and} \quad B^2 \equiv q \pmod{4aa'}. \]
See [Cox97, Lemma 3.2] for a proof of the uniqueness of B.
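The definition of the Dirichlet composition translates directly into a small computation. The Python sketch below is an illustration under simplifying assumptions: it finds B by a plain search over the residues modulo 2aa' instead of solving the congruences systematically, and the function names are hypothetical.

```python
from math import gcd

def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c

def dirichlet_compose(f, g):
    """Dirichlet composition of f = (a, b, c) and g = (a', b', c').

    Requires equal discriminants and gcd(a, a', (b + b')/2) = 1.
    B is found by brute force over the residues modulo 2aa'.
    """
    a, b, _ = f
    a2, b2, _ = g
    q = discriminant(f)
    if discriminant(g) != q:
        raise ValueError("the forms must have the same discriminant")
    if gcd(gcd(a, a2), (b + b2) // 2) != 1:
        raise ValueError("a, a' and (b + b')/2 must have no common prime divisor")
    for B in range(2 * a * a2):
        if ((B - b) % (2 * a) == 0 and (B - b2) % (2 * a2) == 0
                and (B * B - q) % (4 * a * a2) == 0):
            return (a * a2, B, (B * B - q) // (4 * a * a2))
    raise ArithmeticError("no admissible B found")
```

For the discriminant q = −56, composing 3x^2 + 2xy + 5y^2 with itself gives B = 14 and hence the form 9x^2 + 14xy + 7y^2, again of discriminant −56.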
Now, one can show (see the references given in [Cox97, §3]) that this composition of special
forms induces a well-defined binary operation on K(q) and turns it into an abelian group – the
form class group of discriminant q – with order h(q) and the following identity element and
inverses: Given a negative integer q ≡ 0, 1 (mod 4), the principal form of discriminant q is
defined by
\[ \begin{cases}
x^2 - \dfrac{q}{4}\, y^2 & \text{if } q \equiv 0 \pmod 4, \\[1ex]
x^2 + xy + \dfrac{1 - q}{4}\, y^2 & \text{if } q \equiv 1 \pmod 4.
\end{cases} \]
The form class containing the principal form is the identity element of the class group K(q) and
it is called its principal class; we denote it by $C_0(q)$ (or simply by $C_0$ if the corresponding
discriminant is clear from the context). The inverse of the class which contains the form
$ax^2 + bxy + cy^2$ is the class which contains the form $ax^2 - bxy + cy^2$. We say that a class
is an ambiguous class if its order in K(q) is at most 2; any form in an ambiguous class is called
an ambiguous form.
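As a small illustration of these definitions, the principal form and a representative of the inverse class can be written down directly. This is a sketch with hypothetical function names; forms are again triples (a, b, c).

```python
def principal_form(q):
    """Principal form of a negative discriminant q = 0 or 1 (mod 4)."""
    if q >= 0 or q % 4 not in (0, 1):
        raise ValueError("q must be negative with q = 0 or 1 (mod 4)")
    if q % 4 == 0:
        return (1, 0, -q // 4)        # x^2 - (q/4) y^2
    return (1, 1, (1 - q) // 4)       # x^2 + xy + ((1 - q)/4) y^2

def inverse_representative(form):
    """A form in the inverse class of the class containing (a, b, c)."""
    a, b, c = form
    return (a, -b, c)
```

For example, the principal forms of the discriminants −56 and −23 are x^2 + 14y^2 and x^2 + xy + 6y^2, respectively.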
Switching the attention from quadratic forms to their equivalence classes led to the advent of
the first results on the number of primes that may be represented by any given form. Dirichlet
already sketched a proof of the infinitude of primes representable by certain binary quadratic
forms in 1840, but it was Weber [Web82] who gave the first complete proof that held for all
primitive forms:
Theorem 1.1 (Weber). Every positive definite integral primitive binary quadratic form represents infinitely many primes.
De la Vallée Poussin is best known for his proof of the Prime Number Theorem (which was
independently proved by Hadamard at about the same time). This work [dlVP96] is even more
remarkable if one recalls that it not only contains the proofs of the ordinary prime number
theorem and the corresponding one for primes in arithmetic progressions, but he also proved
there:
Theorem 1.2 (De la Vallée Poussin). Let π(X; q, C) be the number of primes p ≤ X that may be
represented by forms in the form class C of the form class group of the negative discriminant q.
Then
\[ \pi(X; q, C) = \frac{\operatorname{li}(X)}{e(C)h(q)} \cdot \bigl(1 + o(1)\bigr) \quad \text{as } X \to \infty, \]
where e(C) = 2 if C is an ambiguous class and e(C) = 1 otherwise.
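A quick numerical experiment may help to get a feeling for Theorem 1.2 in the simplest case q = −4, where h(q) = 1 and the only class (that of x^2 + y^2) is ambiguous, so that the main term is li(X)/2. The Python sketch below is an illustration only; the function names are hypothetical, and it assumes that li(X) is normalized as the integral of 1/log t over [2, X], approximated here by the midpoint rule.

```python
from math import isqrt, log

def primes_up_to(n):
    """Sieve of Eratosthenes; assumes n >= 2."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def count_primes_two_squares(X):
    """pi(X; -4, C_0): number of primes p <= X of the shape x^2 + y^2."""
    prime_set = set(primes_up_to(X))
    represented = set()
    for x in range(isqrt(X) + 1):
        for y in range(x, isqrt(X - x * x) + 1):
            if x * x + y * y in prime_set:
                represented.add(x * x + y * y)
    return len(represented)

def li_from_2(X, steps=100000):
    """Midpoint-rule approximation of the integral of 1/log t over [2, X]."""
    h = (X - 2) / steps
    return sum(h / log(2 + (k + 0.5) * h) for k in range(steps))
```

One can then compare `count_primes_two_squares(X)` with `li_from_2(X) / 2` for growing X; the theorem only asserts that the quotient tends to 1.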
The original proofs of these theorems were quite long-winded. In the next section, we will see
how the groups K(q), which consist of arithmetic objects, can be linked to groups of algebraic
objects, which turn out to be more convenient to work with. This link led to shorter and more
precise forms of the above statements and will also be the basis of our results.
Remark 1.3. There exists another natural classification of binary quadratic forms, which is
also due to Gauß: We say that two forms of discriminant q lie in the same genus if they represent
the same values in $(\mathbb{Z}/q\mathbb{Z})^*$. Equivalent forms are always in the same genus, but the converse
is usually not true. The most important properties are:
(a) All genera of forms of discriminant q consist of the same number of form classes. If this
number is 1 and q = −4n for some positive integer n, then there exists a congruence
condition like (1.1) which characterizes the primes of the shape $x^2 + ny^2$.
(b) The genus containing the principal form is called the principal genus. It consists of the
squares in the form class group.
(c) Let q be the discriminant of a positive definite form. The number of genera of forms of
discriminant q is given by
\[ \kappa(q) = \begin{cases}
2^{\omega(q)-1} & \text{if } q \text{ is odd}, \\
2^{\omega(q)-2} & \text{if } q \text{ is even and } \frac{q}{4} \equiv 1, 5 \pmod 8, \\
2^{\omega(q)-1} & \text{if } q \text{ is even and } \frac{q}{4} \equiv 2, 3, 4, 6, 7 \pmod 8, \\
2^{\omega(q)} & \text{if } q \text{ is even and } \frac{q}{4} \equiv 0 \pmod 8,
\end{cases} \tag{1.2} \]
where ω(q) denotes the number of distinct prime divisors of q. Both the number of
ambiguous classes in K(q) and the index of the subgroup $(K(q))^2$ in K(q) are also equal
to κ(q).
The proofs of these properties can be found in [Cox97, §3]. We will come across genera and
the numbers κ(q) in Chapter 3, but for the most part of this thesis form classes will be more
important for us.
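Property (c) can be checked numerically for small discriminants by comparing formula (1.2) with a direct count of ambiguous classes. The Python sketch below is an illustration resting on two assumptions that are not stated in the text above: every form class contains exactly one reduced form (|b| ≤ a ≤ c, and b ≥ 0 if |b| = a or a = c), and a reduced form lies in an ambiguous class exactly when b = 0, a = b or a = c (compare [Cox97, §2 and §3]). All function names are hypothetical.

```python
from math import gcd

def reduced_forms(q):
    """Reduced primitive forms of discriminant q < 0 (one per form class)."""
    forms = []
    a = 1
    while 3 * a * a <= -q:
        for b in range(-a + 1, a + 1):
            if (b * b - q) % (4 * a) != 0:
                continue
            c = (b * b - q) // (4 * a)
            if c >= a and not (a == c and b < 0) and gcd(gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def omega(n):
    """Number of distinct prime divisors of n."""
    n, count, p = abs(n), 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)

def genus_count(q):
    """kappa(q) according to formula (1.2)."""
    w = omega(q)
    if q % 2 == 1:
        return 2 ** (w - 1)
    residue = (q // 4) % 8
    if residue in (1, 5):
        return 2 ** (w - 2)
    if residue == 0:
        return 2 ** w
    return 2 ** (w - 1)

def ambiguous_reduced_forms(q):
    """Reduced forms lying in ambiguous classes (assumed criterion, see above)."""
    return [(a, b, c) for (a, b, c) in reduced_forms(q)
            if b == 0 or a == b or a == c]
```

For q = −84, for instance, both counts equal 4: all four classes, represented by (1, 0, 21), (2, 2, 11), (3, 0, 7) and (5, 4, 5), are ambiguous.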
1.2 Algebraic methods for arithmetic objects
Kummer, in his endeavours to find a way to compensate for the lack of unique factorization in the
rings of integers of cyclotomic number fields, introduced the notion of “ideal numbers” in 1847.
According to [GSS07, §II.2.1], Kummer was led to the definition of equivalence classes of these
numbers by the way Gauß had partitioned binary quadratic forms into classes. The intimate
relation to binary quadratic forms persisted when Dedekind generalized Kummer’s concept and
introduced the language of ideals.
Before we can state this connection explicitly, we have to fix the notation that we will use
for certain notions of algebraic number theory: Every quadratic number field K can be written
uniquely in the form $\mathbb{Q}(\sqrt{q})$ for a squarefree integer q ≠ 0, 1. Its discriminant $d_K$ is given by
$d_K = q$ if q ≡ 1 (mod 4) and by $d_K = 4q$ otherwise. The union of the set of integers which are
discriminants of quadratic fields and {1} is called the set of fundamental discriminants. We will
denote the set of all negative fundamental discriminants by F, i.e.
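\[ F = \{ d \in \mathbb{Z} \mid d < 0,\ d \equiv 1 \pmod 4 \text{ and } d \text{ is squarefree} \}
\cup \{ d \in \mathbb{Z} \mid d < 0,\ d \equiv 0 \pmod 4 \text{ and } \tfrac{d}{4} \not\equiv 1 \pmod 4 \text{ is squarefree} \}. \tag{1.3} \]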
Furthermore, the set of all negative integers q ≡ 0, 1 (mod 4), i.e. the set of negative discriminants
of quadratic forms, will be denoted by D.
Let q ∈ D. Then there exist a unique positive integer r (which is called the conductor of q)
and a fundamental discriminant $q_0$ ∈ F such that $q = r^2 q_0$. Moreover, there exists a unique
order of discriminant q in $\mathbb{Q}(\sqrt{q_0})$: Recall that an order O in a quadratic field K is any subring
of K containing 1 such that O is a finitely generated Z-module that contains a Q-basis of K.
For example, the ring of integers of K is always an order in K and, in fact, the maximal one.
The discriminant of any order O in K is the product of the square of the index of O in the ring
of integers and the discriminant of the field. For any given discriminant q ∈ D of a binary
quadratic form, we will only be interested in the order of discriminant q in $\mathbb{Q}(\sqrt{q_0})$, which we
denote by O(q); the ring of integers of $\mathbb{Q}(\sqrt{q_0})$ will be denoted by $O_q$. Note that O(q) equals $O_q$
if q ∈ F.
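The decomposition of a discriminant into its conductor and its fundamental part can be computed by a direct search. The following Python sketch is only an illustration of the definitions in (1.3); the function names are hypothetical and the trial-division approach is an assumption chosen for brevity.

```python
def is_squarefree(n):
    """True if no square of a prime divides the non-zero integer n."""
    n = abs(n)
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return False
        p += 1
    return n != 0

def is_negative_fundamental(d):
    """Membership test for the set F of (1.3)."""
    if d >= 0:
        return False
    if d % 4 == 1:
        return is_squarefree(d)
    if d % 4 == 0:
        return (d // 4) % 4 != 1 and is_squarefree(d // 4)
    return False

def conductor_decomposition(q):
    """Return (r, q0) with q = r^2 * q0, r >= 1 and q0 in F."""
    if q >= 0 or q % 4 not in (0, 1):
        raise ValueError("q must be negative with q = 0 or 1 (mod 4)")
    r = 1
    while r * r <= -q:
        if q % (r * r) == 0 and is_negative_fundamental(q // (r * r)):
            return r, q // (r * r)
        r += 1
    raise ArithmeticError("no decomposition found")
```

For example, −56 is itself fundamental (conductor 1), whereas −12 = 2^2 · (−3) and −108 = 6^2 · (−3) have conductors 2 and 6.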
For all q ∈ D we define:
• I(q), the group of invertible fractional O(q)-ideals, i.e. the group of invertible finitely
generated O(q)-submodules of $\mathbb{Q}(\sqrt{q_0})$; note that fractional O(q)-ideals are usually not
ring ideals of O(q);
• P (q), the subgroup of principal fractional O(q)-ideals;
• H(q), the quotient I(q)/P (q), i.e. the ideal class group of the order O(q);
• Z(q), the set of non-zero integral O(q)-ideals;
• N(a), the norm of the ideal a ∈ Z(q), i.e. the size of the quotient ring O(q)/a (the
dependence on q is suppressed).
The algebraic properties of all these objects are explained in [Cox97, §5 and §7], for example.
At this point, we just recall that H(q) is always a finite abelian group and say, analogously to
the notion for form classes, that an ideal class K ∈ H(q) is ambiguous if $K = K^{-1}$ in H(q).
Binary quadratic forms and ideal classes are linked through the following result, which is
due to Dedekind:
Lemma 1.4 (Dedekind). For every negative discriminant q, there exists an isomorphism
\[ B_q : K(q) \to H(q) \]
which is induced by the map that sends the binary quadratic form $f(x, y) = ax^2 + bxy + cy^2$ to
the ideal of O(q) that is generated by a and $\frac{-b + \sqrt{q}}{2}$. In particular, we have
\[ h(q) = |K(q)| = |H(q)|. \]
Moreover, a positive integer m is represented by the positive definite binary quadratic forms
in the class C ∈ K(q) if and only if there exists an ideal $\mathfrak{a} \in B_q(C)$ such that $N(\mathfrak{a}) = m$.
A proof can be found in [Cox97, Theorem 7.7], for example.
This relation helped to drive forward the development of algebraic number theory thanks
to the extensive theory Gauß had created on the arithmetic side of this bijection. On the other
hand, it turned out that many statements on binary quadratic forms can be proved in a simpler
way by using the amenities of the algebraic side.
Remark 1.5. A major drawback of orders is the fact that they are usually not Dedekind
domains (i.e., the factorization is not unique at the level of ideals); on the other hand, the ring
of integers Oq is always a Dedekind domain. However, it turns out that this problem is not
a severe one for most questions on primes represented by binary quadratic forms. In fact, if
q ∈ D with $q = r^2 q_0$, where $q_0$ ∈ F and r is a positive integer, we let $I_r$ denote the group
of fractional $O_q$-ideals $\mathfrak{a}$ satisfying $\mathfrak{a} + rO_q = O_q$. Moreover, let $P_r$ denote the subgroup of
$I_r$ generated by principal ideals of the form $\alpha O_q$ such that $\alpha \in O_q$ satisfies $\alpha \equiv a \pmod{rO_q}$
for some integer a with (a, r) = 1. Then one can show (see [Cox97, §7]) that there exists an
isomorphism $\tilde{B}_q : H(q) \to I_r/P_r$. In particular, it follows with Lemma 1.4 that a positive
integer m satisfying (m, r) = 1 is represented by the positive definite binary quadratic forms in
C ∈ K(q) if and only if there exists an $O_q$-ideal $\mathfrak{a} \in \hat{B}_q(C) := \tilde{B}_q(B_q(C))$ such that $|O_q/\mathfrak{a}| = m$.
We state the resulting qualitative information about the representability of integers by binary
quadratic forms of a given discriminant in a more explicit way:
Proposition 1.6. Let q be a negative discriminant and let C ∈ K(q) be a form class. Write
$q = r^2 q_0$, where r is the conductor of q and $q_0$ ∈ F.
(a) Let p be a prime which does not divide r. Then there exist x, y ∈ Z and a binary quadratic
form f of discriminant q such that f(x, y) = p if and only if either
(i) p ramifies in $O_q$, i.e. there exists a prime ideal $\mathfrak{p}$ in $O_q$ with $pO_q = \mathfrak{p}^2$; in this case
p may be represented by forms of the class C if and only if $\mathfrak{p} \in \hat{B}_q(C)$ and p is then
representable by forms of the class C only; or
(ii) p splits in $O_q$, i.e. there exist distinct prime ideals $\mathfrak{p}_1, \mathfrak{p}_2$ in $O_q$ with
\[ pO_q = \mathfrak{p}_1 \mathfrak{p}_2. \]
In this case p may be represented by forms of the class C if and only if $\mathfrak{p}_1 \in \hat{B}_q(C)$
or $\mathfrak{p}_2 \in \hat{B}_q(C)$; in particular, $\mathfrak{p}_1 \in \hat{B}_q(C)$ if and only if $\mathfrak{p}_2 \in (\hat{B}_q(C))^{-1}$, i.e. p is
representable exactly by forms of the classes C and $C^{-1}$.
(b) Let n be a positive integer which is coprime to r and let
\[ n = \prod_i p_i^{\alpha_i} \prod_j r_j^{\beta_j} \prod_k s_k^{\gamma_k} \prod_\ell t_\ell^{\delta_\ell} \]
be its prime factorization, where the first product is taken over all primes which split in
$O_q$ and are representable by forms of ambiguous classes, the second product is taken over
all primes which split in $O_q$ and are representable by forms of non-ambiguous classes, the
third product is taken over all primes which remain prime in $O_q$ and the fourth product is
taken over all primes which ramify in $O_q$.
Denote the class which represents the prime $p_i$ by $C_{p_i}$, the classes which represent the
prime $r_j$ by $C_{r_j}$ and $C_{r_j}^{-1}$, the class which represents the square $s_k^2$ by $C_{s_k}$ (which must
always be the principal class) and the class which represents the prime $t_\ell$ by $C_{t_\ell}$ (which
must always be an ambiguous class). Then n is representable by forms of discriminant
q if and only if all the exponents $\gamma_k$ are even and it is then representable by exactly the
classes of the form
\[ \prod C_{p_i}^{\alpha_i} \prod C_{r_j}^{\beta_j - v_j} \prod C_{s_k}^{\gamma_k/2} \prod C_{t_\ell}^{\delta_\ell}
= \prod C_{p_i}^{\alpha_i} \prod C_{r_j}^{\beta_j - v_j} \prod C_{t_\ell}^{\delta_\ell} \]
for all tuples $(v_j)$ of integers $v_j \in \{0, 2, \ldots, 2\beta_j\}$.
Remark 1.7. Most of the classical results that we will present in this chapter are known to
hold for both non-fundamental and fundamental discriminants. It should also be possible to
prove many of the original results in Chapter 2 and Chapter 3 in a general form for both kinds
of discriminants – along the lines of the proofs that we will give for fundamental discriminants
only. However, we believe that the amount of additional technical details that are usually
necessary for general proofs – due to the peculiarities of the square factors of non-fundamental
discriminants – would often eclipse the main arguments. From now on, we will therefore restrict
our attention to fundamental discriminants. In particular, we may henceforth always assume
that O(q) = Oq .
Remark 1.8. If a negative fundamental discriminant q and a form class C ∈ K(q) are given
and we want to estimate a sum of the form
\[ \sum_{\substack{n \le X \\ n \in R(q, C)}} g(n) \]
for some arithmetic function g, then Proposition 1.6 allows us to equivalently estimate the sum
\[ \sum_{n \le X} \sum_{\substack{\mathfrak{a} \in B_q(C) \\ N(\mathfrak{a}) = n}} \frac{g(n)}{v(C, n)}, \]
where the weight function v(C, n) accounts for the fact that, in general, the number of ideals
$\mathfrak{a} \in B_q(C)$ with norm n does not equal 1. Thus, if there exists in $B_q(C)$ an ideal $\mathfrak{a}$ with norm n,
then v(C, n) is the number of ideals $\mathfrak{a} \in B_q(C)$ with $N(\mathfrak{a}) = n$. For further use, we also set
\[ w(C, n) = \sum_{\substack{\mathfrak{a} \in B_q(C) \\ N(\mathfrak{a}) = n}} 1. \tag{1.4} \]
Note that v(C, n) remains undefined if there is no ideal $\mathfrak{a} \in B_q(C)$ with $N(\mathfrak{a}) = n$, while
w(C, n) = 0 in this case; thus, we have w(C, n) = 0 if and only if n ∉ R(q, C), by Lemma 1.4.
Using Proposition 1.6, we may also give an expression for w(C, n) which does not use the
language of ideals but only the language of form classes: We have
\[ w(C, n) = \Bigl| \Bigl\{ (v_j) : C = \prod C_{p_i}^{\alpha_i} \prod C_{r_j}^{\beta_j - v_j} \prod C_{t_\ell}^{\delta_\ell} \Bigr\} \Bigr| \cdot \prod (\alpha_i + 1); \tag{1.5} \]
the second factor arises from the $(\alpha_i + 1)$ possibilities when choosing the prime ideals which lie
over each split prime $p_i$ that is representable by forms of an ambiguous class.
Remark 1.9. The arithmetic functions we will be most interested in are the characteristic
function for the set of rational primes and (smooth versions of) the von Mangoldt function. For
all (positive or negative) fundamental discriminants q, let $\chi_q$ denote the Kronecker symbol $\left(\frac{q}{\cdot}\right)$
(see [IK04, §3.5] or [MV07, §9.3] for the explicit definition); it equals the unique primitive real
Dirichlet character modulo |q| if q ≢ 0 (mod 8) (there are two primitive real Dirichlet characters if
q ≡ 0 (mod 8)). For each rational prime p, the number of solutions m (mod p) to $m^2 \equiv q \pmod p$
equals $1 + \chi_q(p)$ and one can easily show (see [Cox97, Proposition 5.16], for example):
• If $\chi_q(p) = 0$, i.e. if p divides q, then p ramifies in O(q), i.e. $pO(q) = \mathfrak{p}^2$ for some prime
ideal $\mathfrak{p}$ of O(q) and $N(\mathfrak{p}) = p$;
• if $\chi_q(p) = 1$, then p splits in O(q), i.e. $pO(q) = \mathfrak{p}_1 \mathfrak{p}_2$ for two distinct prime ideals $\mathfrak{p}_1, \mathfrak{p}_2$ of
O(q) and $N(\mathfrak{p}_1) = N(\mathfrak{p}_2) = p$;
• if $\chi_q(p) = -1$, then p remains prime in O(q), i.e. $pO(q) = \mathfrak{p}$ is a prime ideal in O(q) and
$N(\mathfrak{p}) = p^2$.
Consequently, Proposition 1.6 implies that, if $n = p^\ell$ for a prime p and a positive integer ℓ and
if n can be represented by the forms in the class C ∈ K(q), then
\[ w(C, n) = w(C, p^\ell) \begin{cases}
\le \ell + 1 & \text{if } \chi_q(p) = 1, \\
= 1 & \text{if } \chi_q(p) = 0, \\
= 1 & \text{if } \chi_q(p) = -1 \text{ (and } \ell \text{ must be even)}.
\end{cases} \tag{1.6} \]
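The trichotomy of Remark 1.9 is easy to evaluate in practice. The Python sketch below (hypothetical function names) computes $\chi_q(p)$ at a prime p by Euler's criterion for odd p and by the residue of q modulo 8 for p = 2, and compares it with a direct count of the solutions of $m^2 \equiv q \pmod p$. As a cautious simplification, the comparison is only made for odd primes p here.

```python
def kronecker_at_prime(q, p):
    """chi_q(p) for a discriminant q = 0 or 1 (mod 4) and a prime p."""
    if q % p == 0:
        return 0
    if p == 2:
        # q is odd here; the value depends only on q modulo 8
        return 1 if q % 8 in (1, 7) else -1
    # Euler's criterion for odd primes p
    return 1 if pow(q % p, (p - 1) // 2, p) == 1 else -1

def square_root_count(q, p):
    """Number of residues m modulo p with m^2 = q (mod p)."""
    return sum(1 for m in range(p) if (m * m - q) % p == 0)

def splitting_type(q, p):
    """Behaviour of p in O(q) according to the three cases of Remark 1.9."""
    return {0: "ramifies", 1: "splits", -1: "remains prime"}[kronecker_at_prime(q, p)]
```

For q = −4, for example, `splitting_type(-4, 5)` returns `"splits"` and `splitting_type(-4, 3)` returns `"remains prime"`, in accordance with the fact that 5 is a sum of two squares and 3 is not.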
Only a small set of primes ramifies in O(q). Thus, if the number w(C, p) is positive, it will
usually be given by w(C, p) = 2 if C is ambiguous, and w(C, p) = 1 otherwise. For further use,
we therefore put
\[ e(C) = \begin{cases}
2 & \text{if } C \text{ is ambiguous}, \\
1 & \text{if } C \text{ is not ambiguous}.
\end{cases} \tag{1.7} \]
Note that we thus have
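\[ \sum_{C \in K(q)} \sum_{p \le X} w(C, p)
= \sum_{C \in K(q)} e(C) \sum_{\substack{p \le X \\ p \in R(q, C)}} 1 - \sum_{\substack{p \le X \\ p \mid q}} 1
= \sum_{p \le X} \bigl(1 + \chi_q(p)\bigr). \tag{1.8} \]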
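The second equality in (1.8) can be verified numerically for small fundamental discriminants. The Python sketch below is an illustration under the same assumptions as before, neither of which is stated in the text above: form classes are represented by reduced forms (|b| ≤ a ≤ c, and b ≥ 0 if |b| = a or a = c), and a reduced form lies in an ambiguous class exactly when b = 0, a = b or a = c. All function names are hypothetical.

```python
from math import gcd, isqrt

def reduced_forms(q):
    """Reduced primitive forms of discriminant q < 0 (one per form class)."""
    forms, a = [], 1
    while 3 * a * a <= -q:
        for b in range(-a + 1, a + 1):
            if (b * b - q) % (4 * a) == 0:
                c = (b * b - q) // (4 * a)
                if c >= a and not (a == c and b < 0) and gcd(gcd(a, b), c) == 1:
                    forms.append((a, b, c))
        a += 1
    return forms

def represents(form, m):
    """True if the positive definite form (a, b, c) represents m >= 1."""
    a, b, c = form
    q = b * b - 4 * a * c
    for y in range(isqrt(4 * a * m // -q) + 1):   # y >= 0 suffices by symmetry
        disc = q * y * y + 4 * a * m              # (2ax + by)^2 must equal disc
        root = isqrt(disc)
        if root * root == disc and ((-b * y + root) % (2 * a) == 0
                                    or (-b * y - root) % (2 * a) == 0):
            return True
    return False

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def kronecker_at_prime(q, p):
    if q % p == 0:
        return 0
    if p == 2:
        return 1 if q % 8 in (1, 7) else -1
    return 1 if pow(q % p, (p - 1) // 2, p) == 1 else -1

def both_sides(q, X):
    """Middle expression and right-hand side of (1.8) for a fundamental q."""
    primes = [p for p in range(2, X + 1) if is_prime(p)]
    middle = -sum(1 for p in primes if q % p == 0)
    for (a, b, c) in reduced_forms(q):
        e = 2 if (b == 0 or a == b or a == c) else 1
        middle += e * sum(1 for p in primes if represents((a, b, c), p))
    right = sum(1 + kronecker_at_prime(q, p) for p in primes)
    return middle, right
```

For q = −20 and X = 10, for instance, both expressions are equal to 6.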
In Chapter 2, we will be interested in questions of uniformity. That is, given a large real
number X and a “reasonable” arithmetic function g, we would like to know whether there exists
an estimate for
\[ \sum_{\substack{n \le X \\ n \in R(q, C)}} g(n) \tag{1.9} \]
which is uniform in (i.e., independent of) the choice of the form class C ∈ K(q) and the error
term of which is also uniform in the choice of the discriminant q. Thus, one would instinctively
expect a “reasonable” function to be a function which shows no obvious reason to favour any
classes or discriminants; the sum in (1.9) should, for any specific form class, therefore not differ
much from the average over all C ∈ K(q) of such sums, for all q in some large range. However,
the distinct behaviour of ambiguous and non-ambiguous classes in their capability to represent
primes, which is evident from Proposition 1.6 and the remarks above, shows that we usually
cannot expect estimates that are independent of the given form class. Nevertheless, if g is the
characteristic function for the set of prime numbers, for example, we may still hope that a
“uniformity up to the factors e(C)” holds.
Due to the close relation between forms and ideals, we may then ask the same question on
uniformity for sums of the type
\[ \sum_{n \le X} \sum_{\substack{\mathfrak{a} \in C \\ N(\mathfrak{a}) = n}} \tilde{g}(n) \tag{1.10} \]
for (fundamental) discriminants q, ideal classes C ∈ H(q) and arithmetic functions $\tilde{g}$. It turns
out that there is less reason to expect a significant dependence on the given class here.² Analytic methods also often tend to cooperate better with algebraic objects like ideals than with
arithmetic objects like quadratic forms. Hence, chances are better to estimate a sum like (1.10)
for a function $\tilde{g}(n)$ which is “usually” close to $\frac{g(n)}{v(C, n)}$ (see Remark 1.8), then translate the result
by means of the bijection $B_q$ to an estimate for (1.9) and hope that the term “usually” indeed
means “sufficiently often” in order to give an additional error term which is small (and still
uniform). We will often benefit from this procedure.
Leaving the uniformity in q aside (which will be the topic of Chapter 2), there exist classical
results which give uniformity in C ∈ H(q) in (1.10) for certain functions $\tilde{g}$. For constant
functions we have:
² Note, however, that this is basically only true if the inner sum in (1.10) is only over ideals which are products
of prime ideals that lie over split primes: By Remark 1.9, prime ideals that lie over ramified primes or over primes
that remain prime in O(q) may only be contained in an ambiguous class or in the principal class, respectively.
Conveniently, these prime ideals are rare (or have a relatively large norm) and are therefore negligible for most
of our considerations.
Theorem 1.10 (The ideal theorem for ideal classes). Let q ∈ F with |q| > 4 and let C ∈ H(q).
Then
\[ \sum_{n \le X} \sum_{\substack{\mathfrak{a} \in C \\ N(\mathfrak{a}) = n}} 1 = \frac{\pi X}{\sqrt{|q|}} + O_q\bigl(X^{1/3}\bigr). \]
This was proved by Landau in 1918; see [Nar04, §7.4.13] and the references there.
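Of even greater interest to us, when $\tilde{g}$ is the characteristic function for the set of primes, we
have
\[ \sum_{\substack{p \le X \\ \text{prime}}} \sum_{\substack{\mathfrak{a} \in C \\ N(\mathfrak{a}) = p}} 1 = \frac{\operatorname{li}(X)}{h(q)} + O_q\bigl(X e^{-c\sqrt{\log X}}\bigr) \tag{1.11} \]
for all q ∈ F, all C ∈ H(q) and a constant c = c(q) > 0. This is a consequence of: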
Theorem 1.11 (The prime ideal theorem for ideal classes). Let q ∈ F and let C ∈ H(q). Then
there exists a constant c = c(q) > 0 such that
\[ \sum_{\substack{\mathfrak{p} \in C \\ N(\mathfrak{p}) \le X}} 1 = \frac{\operatorname{li}(X)}{h(q)} + O_q\bigl(X e^{-c\sqrt{\log X}}\bigr), \tag{1.12} \]
where the sum on the left side is over prime ideals of O(q) only.
Landau proved a general version of this statement with a weaker error term in 1907. The
version above was shown by him in 1918 and the error term has been only slightly improved
since then; see [Nar04, §7.2 and §7.4.12] and the references there. Note that the left sides
of (1.11) and (1.12) may only differ by the prime ideals which lie over rational primes that
remain prime in O(q) (and this may only happen if C is the principal class). By Remark 1.9,
the norm of these prime ideals is the square of the respective rational primes. Thus, their
contribution is less than $\sqrt{X}$ and therefore negligible in (1.11).
From (1.11) and our previous remarks, we easily derive:
Theorem 1.12 (Landau’s prime number theorem for binary quadratic forms). Let q ∈ F and
let C ∈ K(q). Then there exists a constant c = c(q) > 0 such that
\[ \pi(X; q, C) := \sum_{\substack{p \le X \text{ prime} \\ p \in R(q, C)}} 1 = \frac{\operatorname{li}(X)}{e(C)h(q)} + O_q\bigl(X e^{-c\sqrt{\log X}}\bigr). \]
Indeed, whenever a prime ideal of O(q) lies over a split prime p in $B_q(C)$, there are, by
Proposition 1.6 and Remark 1.9, exactly e(C) prime ideals lying over p in $B_q(C)$. Thus,
\[ e(C)\,\pi(X; q, C) = \sum_{p \le X} \sum_{\substack{\mathfrak{a} \in C \\ N(\mathfrak{a}) = p}} 1 + O(\log |q|), \]
where the error term takes into account potential prime ideals which lie over ramified primes,
i.e. whose norm divides q; note that there are $\omega(q) \ll \log |q|$ such prime ideals. This error term
is, of course, negligible in Theorem 1.12. Therefore, Theorem 1.12 follows from (1.11).³
³ It should be remarked that Landau [Lan14] gave a direct proof of Theorem 1.12 already in 1914 – without
using ideals and even with an absolute constant c.
Remark 1.13. As for the distribution of all integers representable by forms in a given form
class C ∈ K(q), Bernays [Ber12] proved – without using ideals – that there exists a constant
b(q), which does not depend on C, such that
\[ \sum_{\substack{n \le X \\ n \in R(q, C)}} 1 = \frac{b(q) X}{(\log X)^{1/2}} + O_{a,q}\Bigl( \frac{X}{(\log X)^{1/2 + a}} \Bigr) \tag{1.13} \]
for every $a < \min\bigl\{ \frac{1}{h(q)}, \frac{1}{4} \bigr\}$. Unless X is much larger than |q| (that is, unless |q| is smaller
than (log X)), an easy lattice point counting argument gives a better estimate: If f ∈ C, then
\[ \sum_{\substack{n \le X \\ n \in R(q, C)}} 1 \le \bigl| \{ (x, y) \in \mathbb{Z}^2 \mid f(x, y) \le X \} \bigr| \ll \frac{X}{\sqrt{|q|}} + X^{1/2}. \tag{1.14} \]
A proof can be found in [BG06, Lemma 3.1], for example; more precise results for all ranges
of X (relative to q) are given in Theorem 6 of the same paper.
Note that the corresponding statement for integers in any given reduced residue class a of
any given modulus q is completely trivial.
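The lattice point count appearing in (1.14) can be computed exactly for any given form. The Python sketch below (hypothetical function names) counts the integer points in the ellipse f(x, y) ≤ X row by row and also returns the area 2πX/√|q| of that ellipse for comparison; it illustrates the quantities involved and makes no claim about the implied constants.

```python
from math import isqrt, pi, sqrt

def lattice_point_count(form, X):
    """Number of (x, y) in Z^2 with f(x, y) <= X for a positive definite form."""
    a, b, c = form
    q = b * b - 4 * a * c
    count = 0
    y_max = isqrt(4 * a * X // -q)            # |q| y^2 <= 4aX
    for y in range(-y_max, y_max + 1):
        # f(x, y) <= X  <=>  (2ax + by)^2 <= q y^2 + 4aX
        root = isqrt(q * y * y + 4 * a * X)
        lowest = -((b * y + root) // (2 * a))     # ceil((-by - root) / (2a))
        highest = (-b * y + root) // (2 * a)      # floor((-by + root) / (2a))
        count += max(0, highest - lowest + 1)
    return count

def ellipse_area(form, X):
    """Area of the region f(x, y) <= X, namely 2*pi*X / sqrt(|q|)."""
    a, b, c = form
    return 2 * pi * X / sqrt(4 * a * c - b * b)
```

For the form x^2 + y^2 and X = 2, for example, the count is 9, while the area of the corresponding disc is 2π.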
In the last section of this introductory chapter, we will review versions of Theorem 1.12 with
an explicit dependence on q as well as conditional results.
1.3 The Chebotarev density theorem and conditional results
For the investigations on discrepancies in the distribution of primes of the shape $x^2 + ny^2$ in
Chapter 3, we will need precise information on the size of the error term in Theorem 1.12 when
a suitable version of the Generalized Riemann Hypothesis is assumed. Such information exists
in the literature in the form of (effective) versions of the Chebotarev density theorem.
This theorem, first published in 1923, was one of the milestones of algebraic number theory.
It provides quantitative information on the splitting behaviour of prime ideals in Galois extensions of number fields:
Theorem 1.14 (Chebotarev density theorem). Let K be a number field, let L be a Galois
extension of K and let C be a conjugacy class in the Galois group G = Gal(L/K).
Let P be the set of prime ideals of K which are unramified in L. For each $\mathfrak{p} \in P$, let $\left(\frac{L/K}{\mathfrak{p}}\right)$ be the Artin
symbol, i.e. the conjugacy class of Frobenius automorphisms in G corresponding to prime ideals
in L which divide $\mathfrak{p}$. Then, as X → ∞,
\[ \tilde{\pi}(X; L/K, C) := \Bigl| \Bigl\{ \mathfrak{p} \in P : \Bigl(\frac{L/K}{\mathfrak{p}}\Bigr) = C,\ N(\mathfrak{p}) \le X \Bigr\} \Bigr| = \Bigl( \frac{|C|}{|G|} + o(1) \Bigr) \frac{X}{\log X}. \]
See [Nar04, Theorem 7.30] and the references in §7.4.15 of that book.
The connection to our questions is given by the Artin reciprocity law in class field theory
(see [Cox97, §5, §8 and §9]): Let q ∈ D and C ∈ K(q). We have seen in the last section that
we can translate questions on primes which may be represented by forms in C to questions
on ideals in the ideal class $B_q(C) \in H(q)$. Moreover, one can show (see [Cox97, §9.A])
that there exists an isomorphism $B'_q : H(q) \to \mathrm{Gal}(L/K)$ between H(q) and the Galois group
Gal(L/K), where $K = \mathbb{Q}(\sqrt{q})$ and L is the ring class field of the order O(q). If q is a fundamental
discriminant – which we always assume –, then L is called the Hilbert class field of K and is
the maximal unramified abelian extension of K. Since Gal(L/K) is abelian, every conjugacy
class contains only one element. Thus, Theorem 1.14 yields
\[ \tilde{\pi}\bigl(X; L/K, B'_q(B_q(C))\bigr) = \Bigl( \frac{1}{h(q)} + o(1) \Bigr) \frac{X}{\log X} \]
as X → ∞. With this information and Proposition 1.6, one can therefore recover Theorem 1.2
from Theorem 1.14.
Lagarias and Odlyzko [LO77] gave an explicit error term in Theorem 1.14. Using the connection which we have just described, their result yields:
Theorem 1.15 (Explicit prime number theorem for binary quadratic forms). Let q ∈ F and
let C ∈ K(q). Then there exists an absolute constant c > 0 such that
π(X; q, C) =
√
li(X)
li(X β )
+
+ O Xe−c h(q) log X ,
e(C)h(q) e(C)h(q)
where β is a possible Landau–Siegel zero⁴ of the Dedekind zeta-function of the Hilbert class field
of $\mathbb{Q}(\sqrt{q})$.
We will not use this theorem, because we will need a much stronger – and therefore conditional –
result to achieve useful statements in Chapter 3.
Lagarias and Odlyzko proved such a result that is conditional on the Generalized Riemann
Hypothesis for the Dedekind zeta-function $\zeta_L$ of the Hilbert class field L of $K = \mathbb{Q}(\sqrt{q})$. We
recall that $\zeta_L$ is given by
\[ \zeta_L(s) = \sum_{\mathfrak{a}} \frac{1}{N(\mathfrak{a})^s} \]
for all s ∈ C with Re(s) > 1, and this series admits an analytic continuation to C ∖ {1} (see
[Neu92, §VII.5], for example); here the sum is over the non-zero ideals of the ring of integers
of L, and $N(\mathfrak{a})$ denotes the corresponding absolute norm of the ideal $\mathfrak{a}$.
Serre [Ser81] gave a slightly simplified version of the result in [LO77]: If the Dedekind zeta-function
of the Hilbert class field has all its non-trivial zeros on the line Re(s) = 1/2, then the
conditional prime ideal theorem
\[
  \tilde{\pi}(X; L/K, C) = \frac{\operatorname{li}(X)}{h(q)} + O\bigl(X^{1/2} (\log |q|X)\bigr)
  \tag{1.15}
\]
holds for all q ∈ F and all conjugacy classes C ∈ Gal(L/K); compare [Ser81, Théorème 4 and
(20R)]. As in the proof of Theorem 1.12, we may conclude the following conditional prime
number theorem for binary quadratic forms:
Theorem 1.16 (Explicit prime number theorem for binary quadratic forms under GRH). Let
q ∈ F and let C ∈ K(q). Assume the Generalized Riemann Hypothesis for the Dedekind
zeta-function of the Hilbert class field of Q(√q). Then
\[
  \pi(X; q, C) = \frac{\operatorname{li}(X)}{e(C)h(q)} + O\bigl(X^{1/2} (\log |q|X)\bigr).
  \tag{1.16}
\]
Footnote 4: These hypothetical zeros are defined similarly to the Landau–Siegel zeros for Dirichlet L-functions that we
will discuss in Section 2.3.2; see [LO77, Theorem 1.3] for the precise definition and a reference to explicit and
effectively computable bounds for β. Thus, this result is indeed an explicit and effective version of Theorem 1.12.
A conditional version of a formula that links sums of the values of ideal class group characters
at prime ideal powers with sums over the zeros of the corresponding L-functions up to a certain
height will also be necessary; such formulas are known as (approximate) explicit formulas. One
can derive a formula of this kind from the results in [LO77, §7]; this has already been done in
[LP92] (see the proof of Theorem 3.1 there). Before we state the result, we recall the notions
of (ideal) class group characters and the corresponding L-functions:
Remark and Definition 1.17. The results in [LO77, §7] are given for the Artin L-functions
which are attached to the characters of Gal(L/K). However, because H(q) and Gal(L/K)
are isomorphic, these results also hold for (ideal) class group characters, i.e. for the group
homomorphisms from H(q) to the unit circle in the complex plane.
We denote the set of the ideal class group characters for the discriminant q by Ĥ(q) – which
is therefore the (Pontryagin) dual group of H(q) – and we write χ_0^{(q)} for the trivial character.
Overloading the notation, we define χ(a) := χ(C) if C ∈ H(q) is the ideal class of the non-zero
fractional ideal a. For further use, we also define
\[
  \lambda_\chi(n) := \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) = n}} \chi(\mathfrak{a})
\]
for all χ ∈ Ĥ(q) and all positive integers n.
The L-functions associated to the characters χ ∈ Ĥ(q), the class group L-functions, are
given by
\[
  L(s, \lambda_\chi) := \sum_{\mathfrak{a} \in Z(q)} \frac{\chi(\mathfrak{a})}{N(\mathfrak{a})^{s}}
  = \sum_{n \ge 1} \frac{\lambda_\chi(n)}{n^{s}}
\]
for Re(s) > 1 and each of these series has an analytic continuation to the whole complex plane
unless χ = χ_0^{(q)}, when the continuation is meromorphic with a pole at s = 1 (see [Hec17, §4]).
We also note that the Dedekind zeta-function ζL may be written as the product of all Artin
L-functions which are attached to the irreducible characters of Gal(L/K) (see [Neu92, §VII.10]).
The isomorphism B_q′ : H(q) → Gal(L/K) then yields a product expansion of ζ_L(s) in terms
of the class group L-functions L(s, λ_χ) of all χ ∈ Ĥ(q).
Since the functions L(s, λ_χ) are entire
whenever χ is non-trivial (and have only a pole at s = 1 otherwise), the assumption of the
Generalized Riemann Hypothesis for ζ_L is therefore equivalent to the assumption that all class
group L-functions for the discriminant q have no non-trivial zeros off the line Re(s) = 1/2.
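Written out, the product expansion described in this remark takes the following shape. This is only a sketch: it assumes that Z(q) is the set of non-zero integral ideals of O(q), so that the L-function of the trivial character is the Dedekind zeta-function ζ_K of K = Q(√q), and it uses the identification of the Artin L-functions of Gal(L/K) with the class group L-functions via B_q′.

```latex
\[
  \zeta_L(s) \;=\; \prod_{\chi \in \widehat{H}(q)} L(s, \lambda_\chi)
  \;=\; \zeta_K(s) \prod_{\chi \neq \chi_0^{(q)}} L(s, \lambda_\chi),
  \qquad \operatorname{Re}(s) > 1 .
\]
```

Since the factors with χ ≠ χ_0^{(q)} are entire, every non-trivial zero of ζ_L is a zero of one of the h(q) factors on the right, which is the equivalence of the two forms of the hypothesis stated above.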
By [LO77, §7] and [LP92, §3], a conditional approximate explicit formula for ideal class
group characters is therefore given by:
Theorem 1.18. Let q ∈ F and let χ be a non-trivial character in Ĥ(q). Let Λ̃ denote the
von Mangoldt function for powers of prime ideals in O(q) (see (2.36)). Let T > 2. Assume
the Generalized Riemann Hypothesis for the Dedekind zeta-function of the Hilbert class field of
Q(√q). Then
\[
  \tilde{\psi}(X; \chi) := \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) \le X}} \chi(\mathfrak{a}) \tilde{\Lambda}(\mathfrak{a})
  = - \sum_{|\gamma| < T} \frac{X^{1/2 + i\gamma}}{\tfrac{1}{2} + i\gamma}
  + O\Bigl( (\log |q|)(\log X) + \frac{X (\log XT)^{2}}{T} \Bigr),
\]
where the sum Σ_γ is over the zeros 1/2 + iγ with |γ| < T of the L-function L(s, λ_χ) that is
associated to the class group character χ.
Remark. It will not escape the reader’s notice that the complete generality and depth of
neither the original Chebotarev density theorem nor its explicit versions by Lagarias–Odlyzko
and Serre is needed for our questions on primes represented by binary quadratic forms. It is,
of course, just a very special case of their result that we actually use. The proof of Lagarias
and Odlyzko is, as they also say, “a direct descendent of de la Vallée Poussin’s proof of the
prime number theorem”. It would therefore certainly be possible to deduce the results above by
making explicit either Landau’s or de la Vallée Poussin’s proof of the prime number theorems
for binary quadratic forms that we have mentioned in the preceding sections.
Remark 1.19. Moreover, we will not even use the explicit dependency on the discriminant
that is present in Theorem 1.16 and Theorem 1.18. Nevertheless, we decided to state these
results explicitly to make clear that they may be a starting point for an improvement of our
Theorem 3.5 on prime number races in terms of a completely explicit error term – after all,
the proofs in [LO77] should make it possible to give numerical values for the implicit absolute
constants in the theorems above.
Chapter 2
The average distribution of primes
represented by positive definite
binary quadratic forms with varying
discriminant
Among the most important and attractive questions in the theory of primes are those that ask
for uniformities in their distribution. The prime number theorem for binary quadratic forms,
Theorem 1.12, therefore raises the question whether the error term can be shown to be uniform,
that is, independent of the discriminant for discriminants in certain ranges.
As we have mentioned before, the prime number theorem for binary quadratic forms (in the
form of Theorem 1.2 with non-explicit error term) is as old as the corresponding theorem for
primes in arithmetic progressions. The investigation of uniformity in the distribution of primes
in arithmetic progressions has formed an area of extensive research in the past century. On the
other hand, there exist hardly any uniformity results for binary quadratic forms and so we aim
to reduce this deficit in this chapter.
Uniformity results – with respect to the discriminants or the moduli – for primes represented
by binary quadratic forms or primes in arithmetic progressions can be basically classified into
the following four types:
(i) for all discriminants or moduli of a certain range and all corresponding form classes or
residue classes;
(ii) for almost all discriminants or moduli of a certain range and all corresponding form classes
or residue classes;
(iii) for almost all discriminants or moduli of a certain range and any fixed form class or residue
class;
(iv) for almost all discriminants or moduli of a certain range and almost all corresponding
form classes or residue classes.
Results of all these types exist for primes in arithmetic progressions. The strongest known
theorems of each type are
(i) the Siegel–Walfisz theorem;
(ii) the Bombieri–Vinogradov theorem;
(iii) the Fouvry–Iwaniec / Bombieri–Friedlander–Iwaniec theorems;
(iv) the Barban–Davenport–Halberstam theorem.
The strength of the underlying assumptions decreases from (i) to (iv) and therefore the known
admissible ranges for the moduli increase from (i) to (iv). We will review all these results in
the next section.
The only hitherto known result of this kind for primes represented by binary quadratic forms
has been of type (i):
Theorem 2.1 (Blomer, [Blo04a, Lemma 3.1]). For any A > 0, there exists a constant
c = c(A) > 0 such that
\[
  \pi(X; q, C) = \frac{\operatorname{li}(X)}{e(C)h(q)} + O\bigl(X e^{-c\sqrt{\log X}}\bigr)
\]
uniformly for all negative integers q ≡ 0 or 1 (mod 4) with |q| ≤ (log X)^A and all C ∈ K(q).
We are not aware of any prior results of types (ii)–(iv) for primes represented by binary
quadratic forms and will therefore prove results of type (ii) in Section 2.3 and results of type (iv)
in Section 2.4. Type-(iii)-results that go beyond the range of the corresponding type-(ii)-results
are usually much deeper and will be left aside in this work. Appropriate large sieve inequalities
lie at the heart of results of types (ii)–(iv). Such an inequality for complex class group characters
will be proved in Section 2.2. We go without an analogous inequality for real class group
characters as these can be regarded as convolutions of Dirichlet characters and may be controlled
in a simpler way. We close this chapter in Section 2.5 with two easy applications and a list of
open problems.
All discriminants in this chapter are assumed to be negative fundamental discriminants
which are not integer multiples of 8 (this last restriction will be used to greatly simplify the
proof of Lemma 2.8, but it does not appear to be a crucial condition to make the proof work);
we denote this set of discriminants by F.
2.1 Mean-value results for primes in arithmetic progressions
Davenport [Dav00] calls Dirichlet’s 1837 memoir, entitled Beweis des Satzes, daß jede unbegrenzte arithmetische Progression, deren erstes Glied und Differenz ganze Zahlen ohne gemeinschaftlichen Factor sind, unendlich viele Primzahlen enthält, the beginning of analytic number
theory. In this paper Dirichlet gave the first correct proof of the infinitude of primes in arithmetic progressions whose difference, i.e. the modulus, and the first element, i.e. the reduced
residue class, are coprime. In 1896, Hadamard and independently de la Vallée Poussin (in the
second part of his comprehensive work [dlVP96] that we already mentioned in Chapter 1) proved
that π(X; q, a), the number of primes p ≤ X satisfying p ≡ a (mod q), equals (1/ϕ(q) + o(1)) · X/log X
as X → ∞ if q and a are positive integers with (a, q) = 1; that is, the primes are uniformly
distributed in reduced residue classes with respect to a fixed modulus. Landau was the first to
give an explicit error term and the essentially best known result for π(X; q, a) is still due to
Page who proved in 1935 that
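\[
  \pi(X; q, a) = \frac{\operatorname{li}(X)}{\varphi(q)} - \frac{\chi_{\mathrm{ex}}(a) X^{\beta_{\mathrm{ex}}}}{\varphi(q)\beta_{\mathrm{ex}}}
  + O_q\bigl(X e^{-c\sqrt{\log X}}\bigr)
\]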
for all q and a with (a, q) = 1; here c is a positive (effectively computable) constant and the
second term is only present when there is an exceptional character χex modulo q and βex is the
corresponding Landau–Siegel zero (see Section 2.3.2 for the definition of such characters and
zeros). Shortly after this result of Page, estimates which are uniform in the modulus began to
emerge:
1. Walfisz used a result of Siegel on lower bounds for the size of Dirichlet L-functions at the
point 1 to get a uniformity result of type (i) (see footnote 1):
Theorem 2.2 (Siegel–Walfisz theorem). Let A > 0. There exists a number c = c(A) > 0 such
that
\[
  \pi(X; q, a) = \frac{\operatorname{li}(X)}{\varphi(q)} + O\bigl(X e^{-c\sqrt{\log X}}\bigr),
\]
uniformly for all pairs of positive integers a and q with (a, q) = 1 and q ≤ (log X)^A.
Thus, this result is uniform in q, however only in a very small range of moduli. Moreover, c is
a positive, but not effectively computable constant depending only on A. That is, the proof
does not enable us to calculate the constant as long as we neither know that there do not exist
Landau–Siegel zeros nor know the Dirichlet L-functions which have such an exceptional zero.
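The main term li(X)/ϕ(q) can be compared with actual prime counts for small parameters. The following is a minimal numerical sketch, not part of the original argument; the helper names `primes_up_to`, `li` and `residue_class_counts` are hypothetical, and `li` is assumed to mean the integral of 1/log t from 2 to X, approximated here by a midpoint rule.

```python
from math import gcd, log


def primes_up_to(x):
    """Sieve of Eratosthenes: all primes p <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(2, x + 1) if sieve[n]]


def li(x, steps=200_000):
    """Midpoint-rule approximation of the integral of dt / log t over [2, x]."""
    h = (x - 2) / steps
    return h * sum(1.0 / log(2 + (k + 0.5) * h) for k in range(steps))


def residue_class_counts(x, q):
    """Map each reduced residue a modulo q (q >= 2) to pi(x; q, a)."""
    counts = {a: 0 for a in range(1, q) if gcd(a, q) == 1}
    for p in primes_up_to(x):
        if p % q in counts:
            counts[p % q] += 1
    return counts


if __name__ == "__main__":
    X, q = 10 ** 5, 7
    counts = residue_class_counts(X, q)
    main_term = li(X) / len(counts)  # li(X) / phi(q)
    for a, c in sorted(counts.items()):
        print(f"a = {a}: pi(X; q, a) = {c}, difference from main term = {c - main_term:+.1f}")
```

Such an experiment only illustrates the size of the deviations for one small modulus; it says nothing about the uniformity in q, which is the actual content of the theorem.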
Which error terms and therefore which ranges of uniformity can one expect? The Generalized
Riemann Hypothesis for Dirichlet characters implies the much better error term O(X^{1/2}(log X)^2)
and there is reason to believe (see [MV07, Conjecture 13.9]) that the true size of the error term
is
\[
  O\Bigl( \frac{X^{1/2+\varepsilon}}{q^{1/2}} \Bigr)
  \tag{2.1}
\]
for q ≤ X. On the other hand, Friedlander and Granville [FG92] – building on an idea of
Maier – showed that the asymptotic formula
\[
  \pi(X; q, a) \sim \frac{\pi(X)}{\varphi(q)},
\]
for any fixed a, cannot hold uniformly in the range q ≤ X(log X)^{−A} for any fixed A > 0.
2. If one gives up the aim of uniformity for both all moduli in a large range
and all corresponding residue classes, then one can fill much of the space between the Siegel–
Walfisz theorem and the result of Friedlander and Granville: The enhancements of Linnik’s
large sieve in the 1960s (see Section 2.2) led to unconditional results which show that these
error terms do hold “on average” (in a suitable sense). The most influential result of this kind
was the following type-(ii)-result, independently proved by Bombieri [Bom65] and Vinogradov
[Vin65, Vin66], which shows that the prime number theorem for arithmetic progressions holds
uniformly for almost all arithmetic progressions modulo q ≤ Q with Q only slightly smaller
than X^{1/2}.
Theorem 2.3 (Bombieri–Vinogradov theorem). For each A > 0, there exists a number
B = B(A) such that
\[
  \sum_{q \le Q} \max_{\substack{a \le q \\ (a,q)=1}} \Bigl| \pi(X; q, a) - \frac{\operatorname{li}(X)}{\varphi(q)} \Bigr|
  \ll_A X (\log X)^{-A}
\]
for Q ≤ X^{1/2}(log X)^{−B} (see footnote 2). The implied constant is not effectively computable.
Footnote 1: References to this and the other above-mentioned classical results as well as a more detailed exposition of
their history can be found in Chapters 2.1, 5 and 6.2 of [Nar00].
Footnote 2: Vinogradov’s result was actually slightly weaker in that his admissible range was Q ≤ X^{1/2−ε} for any
arbitrarily small ε > 0. See also [HR11, p. 127] for references to weaker results of this type which had been
previously proved by Rényi, Barban and Pan.
Although this bound is only a power of a logarithm smaller than what the trivial estimate
π(X; q, a) ≤ X/q + 1 would give, the Bombieri–Vinogradov theorem can replace the Generalized Riemann Hypothesis in many cases. This is the case, for instance, for the Titchmarsh
divisor problem, i.e. the question of determining the asymptotic behaviour of the function
Σ_{p≤X} τ(p + a) for any fixed integer a, and the proof of the infinitude of primes p such that
p + 2 has at most three prime factors; see §3.5 and §9 of [HR11]. Lately, it was used to prove
the remarkable fact that lim inf_{n→∞} (p_{n+1} − p_n)/log p_n = 0 (see footnote 3), where p_n denotes the n-th prime [GPY09].
It is worth mentioning that the conjectured error term (2.1) suggests that the Bombieri–
Vinogradov theorem should hold up to Q ≤ X^{1−ε} for all ε > 0; this is the Elliott–Halberstam
conjecture.
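For small parameters, the left-hand side of the Bombieri–Vinogradov theorem can be evaluated directly and compared with the sum of the trivial bounds. The sketch below is an illustration only; the function names are hypothetical, `li` is assumed to be the integral of 1/log t from 2 to X (approximated by a midpoint rule), and the residue class a = q is represented by the remainder 0, which matters only for q = 1.

```python
from math import gcd, log


def primes_up_to(x):
    """Sieve of Eratosthenes: all primes p <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(2, x + 1) if sieve[n]]


def li(x, steps=200_000):
    """Midpoint-rule approximation of the integral of dt / log t over [2, x]."""
    h = (x - 2) / steps
    return h * sum(1.0 / log(2 + (k + 0.5) * h) for k in range(steps))


def bombieri_vinogradov_sum(x, Q):
    """Return (sum over q <= Q of max_a |pi(x;q,a) - li(x)/phi(q)|, sum of trivial bounds)."""
    primes, main = primes_up_to(x), li(x)
    total = trivial = 0.0
    for q in range(1, Q + 1):
        counts = [0] * q
        for p in primes:
            counts[p % q] += 1
        reduced = [r for r in range(q) if gcd(r, q) == 1]  # gcd(0, 1) == 1 covers q = 1
        phi = len(reduced)
        total += max(abs(counts[r] - main / phi) for r in reduced)
        # |c - m| <= max(c, m) for c, m >= 0, and pi(x; q, a) <= x/q + 1
        trivial += max(x / q + 1, main / phi)
    return total, trivial


if __name__ == "__main__":
    total, trivial = bombieri_vinogradov_sum(10 ** 5, 30)
    print(f"averaged error: {total:.1f}, trivial bound: {trivial:.1f}")
```

The range Q used here is far too small to test the theorem itself; the point is only to make the averaged quantity concrete.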
Analogues of the Bombieri–Vinogradov theorem have been proved in various contexts. With
regard to the connection between classes of binary quadratic forms and ideal classes of quadratic
fields that we will exploit in Section 2.3, it is important to note that many results of Bombieri–
Vinogradov type have also already been proved for number fields by Wilson (1969), Huxley
(1971), Fogels (1972), Johnson (1979) and Hinz (1988) (see footnote 4), but all these results have examined
cases in which the number field is fixed; this is not useful in our case. The only results that
have hitherto been proved for varying number fields are [MM87] and the recent generalization
[MP13]; the fields in these works are of the form K(ζ_q) where K is a fixed number field, ζ_q is a
primitive q-th root of unity and q varies. This case is also quite different from what we will prove
in Section 2.3.
We will not discuss the proof(s) here, since our variants for binary quadratic forms in
Section 2.3 will roughly follow the proofs for arithmetic progressions. We just remark that the
two main ingredients of all known proofs (except for Vinogradov’s proof, which is somewhat
different and harder to describe) are the Siegel–Walfisz theorem, which handles the initial range
of moduli, and the use of the large sieve inequality, which we will discuss in the next section.
The method of proof has nevertheless changed considerably over the years: In the original proofs of Bombieri and
Vinogradov the theorem was a consequence of zero-density results for Dirichlet L-functions;
Gallagher [Gal68] then found a way to omit any discussion of zeros. Nowadays bilinear forms
are at the heart of the proofs; see [IK04, §17] or [FI10, §9.8], for example.
3. A systematic use of bilinear forms was also one of the keys to the type-(iii)-results
of the 1980s. Bombieri, Deshouillers, Fouvry, Friedlander and Iwaniec (in alphabetical, but
not chronological order; moreover, working together in various combinations) were the main
figures who introduced new deep combinatorial techniques and the use of Kloosterman sums to
problems on the distribution of primes in arithmetic progressions. We just quote two of their
type-(iii)-results and refer the reader to [Bom87, §12] and [FI10, §22.2] for an exposition of and
references to their results.
Theorem 2.4 (Fouvry–Iwaniec / Bombieri–Friedlander–Iwaniec theorems). Let a ≠ 0.
(a) There exists an absolute constant B > 0 such that
\[
  \sum_{\substack{Q \le q < 2Q \\ (a,q)=1}} \Bigl| \pi(X; q, a) - \frac{\operatorname{li}(X)}{\varphi(q)} \Bigr|
  \ll_a \frac{X}{(\log X)(\log\log X)^{B}}
\]
if Q ≤ X^{1/2+(log log X)^{−B}}.
(b) Let ε > 0 and Q = X^{4/7−ε}. Let λ be an arithmetic function such that, for any Q_1, Q_2 ≥ 1
with Q_1 Q_2 = Q, there exist two arithmetic functions λ_i (i = 1, 2) with λ_i(n) = 0 if n > Q_i,
|λ_i(n)| ≤ 1 for all n ≥ 1 and λ = λ_1 ∗ λ_2 (the Dirichlet convolution; see footnote 5). Then
\[
  \sum_{\substack{q \ge 1 \\ (a,q)=1}} \lambda(q) \Bigl( \pi(X; q, a) - \frac{\operatorname{li}(X)}{\varphi(q)} \Bigr)
  \ll_{a,A,\varepsilon} X (\log X)^{-A}
\]
for all A > 0.
Footnote 3: This result has seen an astonishing recent improvement, which we state at the very end of this section. The
proof of this advancement does not use the Bombieri–Vinogradov theorem, but it relies instead on deep methods
similar to the ones which are used in the proofs of the type-(iii)-results that we mention below.
Footnote 4: See the references given in [Nar04, §7.4.12].
Notably, the range of moduli under consideration crosses the square-root barrier and therefore the realm of the Generalized Riemann Hypothesis.
4. Finally, the range of the moduli can be made even longer, and the proof of the following
result is even the easiest among the proofs of the theorems that we quoted in this section. We pay for this
larger range by giving up some control over the residue classes because only a (square) mean
over residue classes is considered, i.e. it is a result of type (iv).
Theorem 2.5 (Barban–Davenport–Halberstam theorem). For each A > 0, there exists a number B = B(A) such that
\[
  \sum_{q \le Q} \sum_{\substack{a \le q \\ (a,q)=1}} \Bigl( \pi(X; q, a) - \frac{\operatorname{li}(X)}{\varphi(q)} \Bigr)^{2}
  \ll_A X^{2} (\log X)^{-A}
\]
for Q ≤ X(log X)^{−B}. The implied constant is not effectively computable.
This was proved by Barban and independently by Davenport and Halberstam in the 1960s.
The proof is again based on the Siegel–Walfisz theorem as well as the large sieve inequality
and it is similar to the proof of the Bombieri–Vinogradov theorem, but most of the technical
difficulties of the latter proof do not arise here. Montgomery – who notably did not use any
kind of large sieve inequality but a deep result of Lavrik on twin primes on average – and Hooley
replaced the inequality in the theorem by an asymptotic formula (see footnote 6).
Although the double-average nature of the Barban–Davenport–Halberstam theorem is usually not as useful in applications as the Bombieri–Vinogradov theorem, variations of it have
cameo parts in the proofs of two fascinating results in prime number theory: in the work [FI98]
of Friedlander and Iwaniec on the infinitude of primes of the form x^2 + y^4 and in Zhang’s work
[Zha13] on the infinitude of primes for which the gap to the next prime is bounded by an explicit
absolute constant.
2.2 A large sieve inequality for complex ideal class group characters
Initiated by Linnik in the early 1940s, the large sieve had its first heyday during the 1960s and
early 1970s, being essential in the proofs of seminal results like the already mentioned Bombieri–
Vinogradov theorem on the distribution of primes in arithmetic progressions or Chen’s theorems
towards the Goldbach conjecture and the Twin Prime conjecture (see Chapter 11 in [HR11]).
Bombieri, Friedlander, Iwaniec, Fouvry and others revived its importance in the 1980s with
the type-(iii)-results that we have discussed in the previous section.
Footnote 5: Arithmetic functions with this property are said to be well-factorable.
Footnote 6: See the first paper [Hoo75] of Hooley’s “On the Barban–Davenport–Halberstam theorem”-series (the count
of which reached XIX by 2007) for the last-mentioned result and references to the earlier ones.
Recently, new fascinating applications to the theory of automorphic forms, but also in
arithmetic geometry, ergodic theory and other areas were found; see [IK04, §7] and [Kow08]
for these late developments. In addition to the already mentioned books, we would also like to
cite the very early excellent account [Bar66] on the applications of the large sieve in analytic
number theory.
The general form of large sieve inequalities is as follows: For a given finite set Y of
“harmonics” that are defined on a range {1, . . . , N} of integers, the large sieve provides a
constant c = c(Y, N) > 0 such that the inequality
\[
  \sum_{y \in Y} \Bigl| \sum_{n \le N} a_n y(n) \Bigr|^{2} \le c \, \|a\|^{2}
\]
holds for all complex vectors a = (a_1, . . . , a_N) ∈ C^N. The appropriate harmonics for problems involving
primes in arithmetic progressions are the Dirichlet characters. Hence, the results of the preceding section are built on this large sieve inequality for Dirichlet characters:
Lemma 2.6 (Large sieve inequality for Dirichlet characters). For any positive integers Q and
N and any complex numbers (a_n)_{n≤N}, we have
\[
  \sum_{q \le Q} \; \sideset{}{^*}\sum_{\chi \ (\mathrm{mod}\ q)} \Bigl| \sum_{n \le N} a_n \chi(n) \Bigr|^{2}
  \le (Q^{2} + N) \sum_{n \le N} |a_n|^{2},
\]
where Σ* means that the sum is taken over primitive Dirichlet characters only.
A slightly stronger form of this inequality is proved in [IK04, Theorem 7.13].
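The inequality of Lemma 2.6 can be checked numerically on random data. The sketch below is an illustration under a simplifying assumption: it sums only over q = 1 and odd prime moduli q ≤ Q, for which the primitive characters are exactly the non-principal ones, so the computed left-hand side is a lower bound for the full sum in the lemma. All function names are hypothetical.

```python
import cmath
import random


def prime_factors(m):
    """Set of prime divisors of m >= 2."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def primitive_root(p):
    """Smallest generator of the multiplicative group modulo an odd prime p."""
    factors = prime_factors(p - 1)
    for g in range(2, p):
        if all(pow(g, (p - 1) // f, p) != 1 for f in factors):
            return g
    raise ValueError("p must be an odd prime")


def nonprincipal_characters(p):
    """Yield value tables [chi(0), ..., chi(p-1)] of the p - 2 non-principal characters mod p."""
    g, index, x = primitive_root(p), {}, 1
    for k in range(p - 1):
        index[x] = k  # discrete logarithm of x to the base g
        x = x * g % p
    for j in range(1, p - 1):
        table = [0j] * p
        for r, k in index.items():
            table[r] = cmath.exp(2j * cmath.pi * j * k / (p - 1))
        yield table


def large_sieve_sides(a, Q):
    """Return (restricted left-hand side, right-hand side) for a = (a_1, ..., a_N)."""
    N = len(a)
    lhs = abs(sum(a)) ** 2  # q = 1: the trivial character
    for p in range(3, Q + 1):
        if prime_factors(p) != {p}:
            continue  # skip composite moduli in this simplified sketch
        for table in nonprincipal_characters(p):
            s = sum(a[n - 1] * table[n % p] for n in range(1, N + 1))
            lhs += abs(s) ** 2
    rhs = (Q * Q + N) * sum(abs(x) ** 2 for x in a)
    return lhs, rhs


if __name__ == "__main__":
    random.seed(1)
    a = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
    lhs, rhs = large_sieve_sides(a, 20)
    print(f"restricted left-hand side: {lhs:.1f}, right-hand side: {rhs:.1f}")
```

Including composite moduli would require a construction of all primitive characters, which is omitted here to keep the example short.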
Due to the close relationship between form class groups and ideal class groups (see
Section 1.2), the ideal class group characters, which we have introduced in Remark 1.17, are the
essential harmonics for mean-value estimates for primes represented by binary quadratic forms.
Real class group characters arise from convolutions of real Dirichlet characters (compare
Section 2.3.4) and will be handled by means of Lemma 2.6. This is not possible for complex
class group characters and therefore the following large sieve inequality for such characters will
play a major role in the proofs of our variants of Bombieri–Vinogradov and Barban–Davenport–
Halberstam type results.
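Lemma 2.7. Let F(Q) denote the set of negative fundamental discriminants q ≢ 0 (mod 8)
satisfying |q| ≤ Q. Set Ĥ_1(q) = {χ ∈ Ĥ(q) | χ^2 ≠ χ_0^{(q)}} for all q ∈ F(Q) and
\[
  \widehat{H}_1([Q]) = \bigcup_{q \in \mathcal{F}(Q)} \widehat{H}_1(q).
\]
For each χ ∈ Ĥ_1([Q]) and each positive integer n, set
\[
  \lambda_\chi(n) = \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) = n}} \chi(\mathfrak{a}).
\]
Then
\[
  \sum_{\chi \in \widehat{H}_1([Q])} \Bigl| \sum_{n \le N} a_n \lambda_\chi(n) \Bigr|^{2}
  \ll_\varepsilon \bigl( N (\log N)^{3} + N^{1/2} (\log N) Q^{5/2+\varepsilon} \bigr) \sum_{n \le N} |a_n|^{2}
  \tag{2.2}
\]
for all complex numbers (a_n)_{n≤N} and all ε > 0, Q ≥ 1 and N ≥ 3.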
This result could essentially be regarded as a consequence of a mean-value estimate for
automorphic representations by Duke and Kowalski [DK00, Theorem 4]: Since the numbers
λχ (n) are coefficients of holomorphic cusp forms (for complex characters χ; this is not true for
real class group characters) one might consider the correspondence between holomorphic cusp
forms and automorphic representations (see [Kud03] or [Gel75], for example) and thus set out to
deduce our result from theirs. However, we feel that using such a general result would disguise
the fact that our result does not require any modern tools from the theory of automorphic
representations. Moreover, it seems that a direct deduction from [DK00, Theorem 4] would
require the involved modular forms to be of weight at least two while we need a bound for
coefficients of weight-one cusp forms. Therefore, although we will in essence follow the proof of
Duke and Kowalski, we will give a proof of Lemma 2.7 that uses only “classical” results about
holomorphic cusp forms.
Their proof (and therefore also ours) rests upon three principles:
(i) The duality principle: This method basically exploits the fact that the norm of a bounded
operator on a Hilbert space is the same as the norm of its adjoint and thus allows us to
interchange the sums over χ and n.
(ii) The technique of smoothing sums: When the asymptotic evaluation of sums of the type
Σ_{n≤N} a(n) is difficult, it often turns out that we can transform the problem at hand to
deal with smoothed sums of the type Σ_{n≥1} a(n)φ(n), where φ is a smooth function which
decays very fast for n larger than N. The smoothed sums can then often be understood
by basic properties of harmonic analysis and the last ingredient:
(iii) Rankin–Selberg theory: The underlying Rankin–Selberg method is a tool that allows one
to obtain the meromorphic continuation and a functional equation for the Mellin transform
of the constant term in the Fourier development of an automorphic function. As a result
of this, the meromorphic continuation and a functional equation of
\[
  \sum_{n \ge 1} \lambda_\chi(n) \overline{\lambda_{\chi'}(n)} \, n^{-s}
\]
can be found for all χ, χ′ ∈ Ĥ_1([Q]).
In our case, the properties of the involved Rankin–Selberg convolutions do not depend on
deep facts from the theory of automorphic representations. Instead, our proof will make
use of Li’s functional equation for L-functions associated to Rankin–Selberg convolutions
of holomorphic cusp forms [Li79].
Proof of Lemma 2.7. Let φ be a smooth majorant of the characteristic function of the interval
[0, N], i.e. a positive C^∞ function on [0, +∞) with compact support, 0 ≤ φ ≤ 1 and φ(n) = 1
for n ≤ N. For all χ_1 ∈ Ĥ_1(q_1), χ_2 ∈ Ĥ_1(q_2) with q_1, q_2 ∈ F(Q), let χ_{1,2} be the product of the
(unique) primitive real Dirichlet characters modulo |q_1| and |q_2|; χ_{1,2} is therefore a real Dirichlet
character modulo the least common multiple of q_1 and q_2. Set
\[
  S_N(\chi_1, \chi_2) = \sum_{n \ge 1} \lambda_{\chi_1}(n) \overline{\lambda_{\chi_2}(n)} \, \phi(n/N),
\]
\[
  L(s; \chi_1, \chi_2) = \sum_{n \ge 1} \lambda_{\chi_1}(n) \overline{\lambda_{\chi_2}(n)} \, n^{-s},
\]
\[
  L_{RS}(s; \chi_1, \chi_2) = L(2s, \chi_{1,2}) \, L(s; \chi_1, \chi_2)
\]
for all s ∈ C with Re(s) > 1. The first L-function is the “naïve” convolution L-series of λ_{χ_1}(n)
and the complex conjugate of λ_{χ_2}(n) (which equals λ_{χ_2}(n) for all integers n and is therefore real although χ_2 is not a real
character – a fact that we do not use in this chapter); the second L-function is known as the
Rankin–Selberg convolution L-function. By the Mellin inversion theorem, we have
\[
  S_N(\chi_1, \chi_2) = \frac{1}{2\pi i} \int_{(2)} N^{s} \hat{\phi}(s) L(s; \chi_1, \chi_2) \, ds
  = \frac{1}{2\pi i} \int_{(2)} N^{s} \hat{\phi}(s) \frac{L_{RS}(s; \chi_1, \chi_2)}{L(2s, \chi_{1,2})} \, ds,
  \tag{2.3}
\]
where
\[
  \hat{\phi}(s) = \int_{0}^{+\infty} \phi(x) x^{s-1} \, dx
\]
denotes the Mellin transform of φ (see [Kow04, §2.3] for this and the following basic properties
of smooth cutoff functions and Mellin transforms). We would like to shift the line of integration
on the right-hand side of (2.3) as far to the left as possible. To this end, we need to know the
growth behaviour of the functions in this integral: By the choice of φ, its Mellin transform φ̂
decays faster than any polynomial in all vertical strips of the complex plane. Furthermore, we
have
\[
  \frac{1}{L(2(\sigma + it), \chi_{1,2})} \ll \zeta(2\sigma + 2it) \ll \frac{1}{2\sigma - 1}
  \tag{2.4}
\]
uniformly in t ∈ R if σ > 1/2. As for the Rankin–Selberg L-function L_RS(s; χ_1, χ_2), we consider
the functions
\[
  f_j(z) = \sum_{n \ge 1} \lambda_{\chi_j}(n) e^{2\pi i z n} \qquad (j = 1, 2)
  \tag{2.5}
\]
on the complex upper half plane. Since the involved class group characters χ_j are not real, we
know (see [IK04, §14.3] or [BGHZ08, §4.3], for example) that the functions f_j are normalized
primitive holomorphic cusp forms of weight one, level q_j and nebentypus χ_{q_j} (i.e., the primitive
real Dirichlet character modulo |q_j|). Therefore we also know from classical Rankin–Selberg theory (see [Li79, Theorem 3.1]) that L_RS(s; χ_1, χ_2) is an entire function if f_1 ≠ f_2 or, equivalently,
if χ_1 ≠ χ_2. In this case, it is therefore possible to shift the line of integration to Re(s) = 1/2 + α
with α = (log N)^{−1}. Thus,
\[
  S_N(\chi_1, \chi_2) \ll \int_{(1/2+\alpha)} N^{s} \hat{\phi}(s) \frac{L_{RS}(s; \chi_1, \chi_2)}{L(2s, \chi_{1,2})} \, ds.
\]
The Rankin–Selberg L-function satisfies a functional equation which relates L_RS(s; χ_1, χ_2) with
L_RS(1 − s; χ_1, χ_2) and we may deduce the convexity bound
\[
  L_{RS}(1/2 + \alpha + it; \chi_1, \chi_2) \ll_\varepsilon \bigl( q_1 q_2 (1 + |t|)^{2} \bigr)^{1/2 - \alpha + \varepsilon}
  \tag{2.6}
\]
for every ε > 0 and all t ∈ R; we postpone the proof to Lemma 2.8, where we prove a slightly
sharper bound. By the fast decay of φ̂ and (2.4), we thus get
\[
  S_N(\chi_1, \chi_2) \ll_\varepsilon N^{1/2} (\log N) Q^{1+\varepsilon}
  \tag{2.7}
\]
if χ_1 ≠ χ_2.
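The step from (2.4) and (2.6) to (2.7) can be spelled out as follows. This is a sketch of the omitted computation, assuming that the decay of the Mellin transform of φ on the line Re(s) = 1/2 + α is uniform in N; it uses N^α = e, 1/(2σ − 1) = (log N)/2 for σ = 1/2 + α, and q_1 q_2 ≤ Q^2.

```latex
\begin{align*}
S_N(\chi_1,\chi_2)
  &\ll N^{1/2+\alpha} \cdot \frac{1}{2\alpha}
     \int_{-\infty}^{\infty} \bigl|\hat{\phi}(\tfrac12+\alpha+it)\bigr|
       \bigl(q_1 q_2 (1+|t|)^2\bigr)^{1/2-\alpha+\varepsilon}\,dt \\
  &\ll_\varepsilon N^{1/2}\,(\log N)\,(q_1 q_2)^{1/2+\varepsilon}
     \int_{-\infty}^{\infty} \bigl|\hat{\phi}(\tfrac12+\alpha+it)\bigr| (1+|t|)^{1+2\varepsilon}\,dt \\
  &\ll_\varepsilon N^{1/2}\,(\log N)\,Q^{1+2\varepsilon}.
\end{align*}
```

Replacing 2ε by ε gives the exponent stated in (2.7); the remaining integral is bounded because the Mellin transform decays faster than any power of 1 + |t|.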
Remark. One could try to remove the extraneous ε from the exponent: Heath-Brown [HB09],
using not much more than a modified variant of the classical Jensen formula, recently showed
that one can eliminate the ε from the exponent in the convexity bounds for general (Selberg
class) L-functions on the critical line Re(s) = 1/2. Thus, choosing α = 0 above would expunge the
ε in (2.6). This would require us to give a lower bound for L(1 + it, χ_{1,2}), which would reinsert
a factor of (q_1 q_2)^ε by Siegel’s theorem whenever χ_{1,2} is an exceptional character; however,
dealing with these exceptional cases separately – similar to Proposition 2.20 – it should be
possible to show that these are rare events and that they contribute negligibly (but would lead
to an ineffective overall result). Since this would only be a cosmetic, but not a substantial
improvement of the result at hand, we will not delve into that.
Also note that hitherto existing subconvexity bounds for Rankin–Selberg convolutions either
require that one of the two involved cusp forms is fixed [HM06] or that one cusp form has a
much smaller level than the other [HM13]. Although one may hope that more general results
will be obtained in the future, these will probably only slightly improve our results (due to the
saving of probably only a tiny power of the conductor) and will therefore be less important for
us than for other applications.
The best bound one could hope for in (2.6) is provided by the Lindelöf Hypothesis. We will
state the resulting large sieve inequality in Lemma 2.10.
If χ_1 = χ_2 ∈ Ĥ_1(q), we use the bound
\[
  |\lambda_\chi(n)| \le \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) = n}} 1
  \le \prod_{p^{v} \| n} (v + 1) = \tau(n),
  \tag{2.8}
\]
where the second inequality is due to the fact that each prime divisor p of n splits into at most
two distinct prime ideals in the quadratic field Q(√q). Therefore
\[
  S_N(\chi_1, \chi_1) \le \sum_{n \ge 1} \tau(n)^{2} \phi(n/N) \ll N (\log N)^{3},
  \tag{2.9}
\]
where the implied constant is absolute (see [MV07, (2.31)], for example).
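The second inequality in (2.8) can be tested numerically in the simplest case q = −4, where the quadratic field is Q(i). The sketch below is an illustration under stated assumptions: it uses the standard identity that the number of ideals of norm n in the ring of integers of Q(√q) equals the sum of χ_q(d) over the divisors d of n, specialised to the character modulo 4; the function names are hypothetical.

```python
from math import log


def chi_minus4(d):
    """The primitive real Dirichlet character modulo 4 (Kronecker symbol of -4)."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1


def divisors(n):
    """All positive divisors of n (naive, sufficient for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def ideal_count(n):
    """Number of ideals of norm n in Z[i], computed as sum of chi_{-4}(d) over d | n."""
    return sum(chi_minus4(d) for d in divisors(n))


def tau(n):
    """Number of positive divisors of n."""
    return len(divisors(n))


if __name__ == "__main__":
    N = 1000
    # Second inequality of (2.8): the ideal count never exceeds tau(n).
    assert all(ideal_count(n) <= tau(n) for n in range(1, N + 1))
    # Growth of the sum in (2.9), without the smooth weight.
    second_moment = sum(tau(n) ** 2 for n in range(1, N + 1))
    print(second_moment, second_moment / (N * log(N) ** 3))
```

The printed ratio only indicates the order of magnitude in (2.9); the smooth weight φ(n/N) of the text is replaced by a sharp cutoff here.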
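Now that we have bounded S_N(χ_1, χ_2) for all pairs χ_1, χ_2 ∈ Ĥ_1([Q]), it remains to use a
simple positivity argument and the duality principle in order to get the bound (2.2), which we
originally set out to prove: For all complex numbers b_χ, indexed by the characters χ ∈ Ĥ_1([Q]),
the positivity of φ yields
\begin{align*}
  \sum_{n \le N} \Bigl| \sum_{\chi \in \widehat{H}_1([Q])} b_\chi \lambda_\chi(n) \Bigr|^{2}
  &\le \sum_{n \ge 1} \Bigl| \sum_{\chi \in \widehat{H}_1([Q])} b_\chi \lambda_\chi(n) \Bigr|^{2} \phi(n/N)
   = \sum_{\chi_1, \chi_2 \in \widehat{H}_1([Q])} b_{\chi_1} \overline{b_{\chi_2}} \, S_N(\chi_1, \chi_2) \\
  &\le \sum_{\chi_1, \chi_2 \in \widehat{H}_1([Q])} |S_N(\chi_1, \chi_2)| \, |b_{\chi_1}| \, |b_{\chi_2}|
   \le \sum_{\chi_1, \chi_2 \in \widehat{H}_1([Q])} |S_N(\chi_1, \chi_2)| \bigl( |b_{\chi_1}|^{2} + |b_{\chi_2}|^{2} \bigr) \\
  &\le 2 \max_{\chi_2 \in \widehat{H}_1([Q])} \sum_{\chi_1 \in \widehat{H}_1([Q])} |S_N(\chi_1, \chi_2)|
     \sum_{\chi_2 \in \widehat{H}_1([Q])} |b_{\chi_2}|^{2}.
\end{align*}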
We insert the bounds (2.7) and (2.9) into the right-hand side of this inequality and note that
\[
  |\widehat{H}_1([Q])| \le \sum_{q \in \mathcal{F}(Q)} h(q) \ll \sum_{q \in \mathcal{F}(Q)} |q|^{1/2} (\log |q|) \ll Q^{3/2} (\log Q)
  \tag{2.10}
\]
by the upper class number bound
\[
  h(q) \ll |q|^{1/2} (\log |q|),
  \tag{2.11}
\]
which follows from the bound L(1, χ_q) ≪ log |q| (see [MV07, Lemma 10.15], for example) and
Dirichlet’s class number formula
\[
  h(q) = \frac{w |q|^{1/2} L(1, \chi_q)}{2\pi},
  \tag{2.12}
\]
where w = 6 if q = −3, w = 4 if q = −4 and w = 2 if q < −4 (see [Dav00, §6], for example).
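Thus the bound
\[
  \sum_{n \le N} \Bigl| \sum_{\chi \in \widehat{H}_1([Q])} b_\chi \lambda_\chi(n) \Bigr|^{2}
  \ll_\varepsilon \bigl( N (\log N)^{3} + N^{1/2} (\log N) Q^{5/2+\varepsilon} \bigr) \sum_{\chi \in \widehat{H}_1([Q])} |b_\chi|^{2}
\]
holds for all tuples (b_χ)_{χ∈Ĥ_1([Q])} of complex numbers. By the duality principle (see [IK04, p. 171],
for example), this is equivalent to the statement of the lemma.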
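The class number bound (2.11) that enters (2.10) can be made concrete for small discriminants. The sketch below computes h(q) by counting reduced primitive binary quadratic forms, which is a standard method but not the one used in the text (the text argues via the class number formula (2.12)); the function name `class_number` is hypothetical.

```python
from math import gcd, log, sqrt


def class_number(D):
    """Count reduced primitive forms a x^2 + b x y + c y^2 of discriminant D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    h, a = 0, 1
    while 3 * a * a <= -D:  # reduced forms satisfy a <= sqrt(|D| / 3)
        for b in range(-a + 1, a + 1):  # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue  # not reduced
            if gcd(gcd(a, b), c) == 1:
                h += 1
        a += 1
    return h


if __name__ == "__main__":
    for D in (-3, -4, -23, -47, -163, -1019):
        h = class_number(D)
        ratio = h / (sqrt(-D) * log(-D))
        print(f"D = {D}: h = {h}, h / (|D|^(1/2) log|D|) = {ratio:.3f}")
```

The ratios printed in the last column stay bounded in such experiments, in line with (2.11); summing h(q) over the discriminants up to Q gives the quantity estimated in (2.10).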
It remains to prove the bound (2.6) for the Rankin–Selberg L-function LRS : Based on
the functional equation for Rankin–Selberg L-functions for convolutions of general holomorphic
cusp forms (see [Li79] and (2.13)–(2.15) below) and the Phragmén–Lindelöf principle, we show
The average distribution of primes represented by binary quadratic forms
that the following convexity bound holds for values of the analytic continuation of the Rankin–
Selberg L-function
\[
L_{RS}(s; \chi_1, \chi_2) = L(2s, \chi_{q_1} \chi_{q_2}) \sum_{n \ge 1} \lambda_{\chi_1}(n) \lambda_{\chi_2}(n) n^{-s}
\]
inside the critical strip:
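Lemma 2.8. Let $q_1, q_2 \not\equiv 0 \pmod 8$ be two negative fundamental discriminants. Let $\chi_1 \in \widehat{H}_1(q_1)$ and $\chi_2 \in \widehat{H}_1(q_2)$ be two distinct complex class group characters. For every $\varepsilon > 0$ we have
\[
L_{RS}(s; \chi_1, \chi_2) \ll_\varepsilon \Big( \frac{q_1 q_2}{(q_1, q_2)^{1-\varepsilon}} \cdot (1 + |t|)^2 \Big)^{1 - \sigma + \varepsilon}
\]
for all $s = \sigma + it$ with $0 \le \sigma \le 1$ and $t \in \mathbb{R}$. The implied constant is effectively computable.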
Proof. We start the proof by gathering the notation needed to formulate the functional equation
[Li79, Theorem 2.2] for the Rankin–Selberg L-function LRS as it shows itself in our situation;
although it is not necessary to have a copy of that paper at hand, we suppose that it might
make following the proof a little bit easier. We will use the notation from [Li79] whenever there
is no clash with the basic notation that we have used so far in this work.
Let $q$ be the least common multiple of $q_1$ and $q_2$.⁷ Let $M$ be the conductor of $\chi_{q_1} \chi_{q_2}$; as this is a real Dirichlet character, we have $M = \frac{[q_1, q_2]}{(q_1, q_2)}$. Write $q = M M' M''$ such that every prime divisor $p$ of $M'$ also divides $M$ and $(M'', M M') = 1$; hence $M' = 1$ and $M'' = (q_1, q_2)$.⁸
Note that $q$ is squarefree if $q \not\equiv 0 \pmod 4$; otherwise, $2^2$ is the only proper prime power dividing $q$ and we then have either $4 \mid M''$ (if 4 divides both $q_1$ and $q_2$) or $4 \mid M$. For each prime divisor $p$ of $q$, let $R(p)$, $R'(p)$, $R_1(p)$ and $R_2(p)$ denote the largest power of $p$ that divides $q$, $M'$, $q_1$ and $q_2$, respectively.⁹ Since no square of an odd prime can divide the levels of the fundamental discriminants $q_1$ and $q_2$, we have $R(p) = p$, $R'(p) = 1$ and
\[
R_j(p) = \begin{cases} p & \text{if $p$ divides $q_j$,} \\ 1 & \text{otherwise,} \end{cases} \qquad (j = 1, 2)
\]
for each odd prime $p$ that divides $q$. Moreover, we set
\[
\theta(s; p, \chi_1, \chi_2) = 1 - \chi_{q_1 q_2}(p) \lambda_{\chi_1}(p) \lambda_{\chi_2}(p) p^{-s}
\]
for each odd prime $p$ that divides $M''$ and all $s \in \mathbb{C}$. Finally, we have to check that the conditions A)–C) on page 141 in [Li79] are satisfied. The conditions B) and C) are both only concerned with prime divisors of $M'$ and are therefore trivially satisfied in our case. Condition A) is non-trivial if $\lambda_{\chi_1}(p) = \lambda_{\chi_2}(p) = 0$ for some prime divisor $p$ of $M''$. But if $p$ divides $M''$, then $p$ ramifies in both $\mathbb{Q}(\sqrt{q_1})$ and $\mathbb{Q}(\sqrt{q_2})$; thus $p\,\mathcal{O}(q_j) = \mathfrak{p}_j^2$ for some $\mathfrak{p}_j \in Z(q_j)$ for $j \in \{1, 2\}$. Hence $\lambda_{\chi_j}(p) = \chi_j(\mathfrak{p}_j) \ne 0$ for each $\chi_j \in \widehat{H}(q_j)$. Therefore, Condition A) is also trivially satisfied.
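The decomposition $q = M M' M''$ with $M' = 1$ and $M'' = (q_1, q_2)$ used in this proof can be checked mechanically. The following sketch, with hypothetical helper names, enumerates the negative fundamental discriminants not divisible by 8 and verifies that $M = [q_1, q_2]/(q_1, q_2)$ and $M'' = (q_1, q_2)$ are coprime; it also shows that coprimality fails for the pair $(-8, -4)$, in line with the restriction to discriminants that are not multiples of 8. The test for fundamental discriminants is the standard one and is an assumption of this sketch.

```python
import math


def is_squarefree(n):
    n = abs(n)
    return all(n % (p * p) != 0 for p in range(2, math.isqrt(n) + 1))


def is_fundamental(d):
    """Standard criterion: d = 1 mod 4 squarefree, or d = 4m with m = 2, 3 mod 4 squarefree."""
    if d % 4 == 1:
        return d != 1 and is_squarefree(d)
    if d % 4 == 0:
        m = d // 4
        return m % 4 in (2, 3) and is_squarefree(m)
    return False


def decompose(q1, q2):
    """Return (q, M, M'') with q = lcm(q1, q2), M = q / gcd(q1, q2), M'' = gcd(q1, q2)."""
    g = math.gcd(q1, q2)
    q = abs(q1 * q2) // g
    return q, q // g, g


def decomposition_is_valid(q1, q2):
    """Check q = M * M'' and (M'', M) = 1, i.e. the case M' = 1 of the text."""
    q, M, M2 = decompose(q1, q2)
    return q == M * M2 and math.gcd(M, M2) == 1
```

For example, `decompose(-15, -35)` gives $q = 105$, $M = 21$ and $M'' = 5$.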
We now quote the functional equation for $L_{RS}(s; \chi_1, \chi_2)$ from [Li79, Theorem 2.2]: We have
\[
\Psi(s; \chi_1, \chi_2) = A(s; \chi_1, \chi_2) \Psi(1 - s; \chi_1, \chi_2), \tag{2.13}
\]
⁷ Note that $q$ corresponds to the variable $N$ in [Li79] although $N$ is defined as the maximum of the levels $q_1$ and $q_2$ at the beginning of section 2 of that paper. We believe that this is a typographical error (which occurs at multiple places in that paper), as $N$ is used throughout the paper as the least common multiple of the levels (see Example 2 on page 146, for example) and another interpretation of $N$ would render the other definitions in section 2 and the proofs in section 5 of [Li79] meaningless.
⁸ Note that this is only true because neither of the two fundamental discriminants is an integer multiple of 8.
⁹ They are called $Q$, $Q'$, $Q_1$, $Q_2$, respectively, in Li’s paper.
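where
\[
\Psi(s; \chi_1, \chi_2) = (2\pi)^{-2s} \Gamma^2(s) \, \theta(s; 2, \chi_1, \chi_2) \prod_{\substack{p \mid M'' \\ p \ne 2}} \theta(s; p, \chi_1, \chi_2)^{-1} L_{RS}(s; \chi_1, \chi_2) \tag{2.14}
\]
and
\[
A(s; \chi_1, \chi_2) = c_2(s; \chi_1, \chi_2) \prod_{\substack{p \mid M'' \\ p \ne 2}} B_1(p, \chi_1, \chi_2) p^{1-2s} \prod_{\substack{p \mid M \\ p \ne 2}} B_2(p, \chi_1, \chi_2) R(p)^{1-s} R_1(p)^{-s} R_2(p)^{-s}. \tag{2.15}
\]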
Here $B_1$ and $B_2$ are functions which depend on $p$, $\chi_1$ and $\chi_2$ (but not on $s$) and always have absolute value 1; moreover, $\theta(s; 2, \chi_1, \chi_2)$ is a function that is bounded in the critical strip; and $c_2$ is a function satisfying $c_2(s; \chi_1, \chi_2) = c_2'(\chi_1, \chi_2) 4^{1-2s}$ for some function $c_2'$ with absolute value 1 if $q$ is even, and $c_2$ is constantly 1 otherwise.
Remark. Note that Li’s general functional equation also contains a product over prime factors of $M'$, which we have omitted since $M' = 1$ here. Moreover, the product over prime factors of $M''$ in (2.15) appears in [Li79] with an exponent “$r(p) - m(p)$”, which equals 1 here for all odd primes $p$ that divide $M''$ (this is because no proper power of an odd prime divides $q_1$ and $q_2$); this also explains the definition of $\theta$ above if one compares it with the original definitions of $r(p)$, $m(p)$ and $\theta$ on page 142 of [Li79].¹⁰
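Set
\[
\widetilde{L}_{RS}(s; \chi_1, \chi_2) = \prod_{p \mid M''} \theta(s; p, \chi_1, \chi_2)^{-1} L_{RS}(s; \chi_1, \chi_2).
\]
Since $R_1(p) R_2(p) = p = R(p)$ for all odd primes $p \mid M$, the functional equation for $\widetilde{L}_{RS}$ therefore reads
\[
(2\pi)^{-2s} \Gamma^2(s) \widetilde{L}_{RS}(s; \chi_1, \chi_2) = (2\pi)^{-2(1-s)} \Gamma^2(1 - s) \widetilde{L}_{RS}(1 - s; \chi_1, \chi_2) \cdot |q|^{1-2s} \prod_{p \mid q} B_3(p, \chi_1, \chi_2)
\]
with $|B_3(p, \chi_1, \chi_2)| = 1$ for all $p \mid q$. By the definition of the conductor $\widetilde{c}(\chi_1, \chi_2)$ of the L-function $\widetilde{L}_{RS}(s; \chi_1, \chi_2)$, we get
\[
\widetilde{c}(\chi_1, \chi_2)^{1/2 - s} = \frac{\widetilde{L}_{RS}(s; \chi_1, \chi_2) (2\pi)^{-2s} \Gamma^2(s)}{\widetilde{L}_{RS}(1 - s; \chi_1, \chi_2) (2\pi)^{-2(1-s)} \Gamma^2(1 - s)} = |q|^{1-2s};
\]
see [IK04, p. 94], for example. Hence $\widetilde{c}(\chi_1, \chi_2) = q^2 = [q_1, q_2]^2 = \frac{(q_1 q_2)^2}{(q_1, q_2)^2}$. By the Phragmén–Lindelöf principle (see [Gol06, Theorem 8.2.3], for example), we therefore have the convexity bound
\[
\widetilde{L}_{RS}(s; \chi_1, \chi_2) \ll_\varepsilon \Big( \frac{q_1 q_2}{(q_1, q_2)} \cdot (1 + |t|)^2 \Big)^{1 - \sigma + \varepsilon}
\]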
for all $\varepsilon > 0$ and all $s = \sigma + it$ with $0 \le \sigma \le 1$ and $t \in \mathbb{R}$. In order to bound $L_{RS}(s; \chi_1, \chi_2)$ it remains to find an upper bound for the product $\prod_{p \mid M''} \theta(s; p, \chi_1, \chi_2)$. Note that $p \mid M''$ implies both $p \mid q_1$ and $p \mid q_2$, i.e. $p$ ramifies in both $\mathbb{Q}(\sqrt{q_1})$ and $\mathbb{Q}(\sqrt{q_2})$. Thus, $|\lambda_{\chi_1}(p)| = |\lambda_{\chi_2}(p)| = 1$ for all $p \mid M''$. Hence,
\[
\prod_{p \mid M''} \theta(s; p, \chi_1, \chi_2) \ll \prod_{\substack{p \mid M'' \\ p \ne 2}} (1 + p^{-s}) \ll_\varepsilon (M'')^\varepsilon = ((q_1, q_2))^\varepsilon
\]
for all $\varepsilon > 0$ and all $s = \sigma + it$ with $0 \le \sigma \le 1$ and $t \in \mathbb{R}$. This yields the stated bound for $L_{RS}(s; \chi_1, \chi_2)$.
¹⁰ The complexity of the general functional equation for Rankin–Selberg L-functions for convolutions of holomorphic cusp forms displays the major drawback of considering these L-functions from the classical viewpoint and not using the correspondence to L-functions of automorphic representations, which usually take a more natural form (see [Mic07, §2.3] and the references there). The effort needed to apply this equation when $q_1$ and $q_2$ are not fundamental discriminants seems disproportionate and one would certainly be well-advised to translate the situation to the automorphic setting then.
Remark 2.9.
(a) Harcos and Michel [HM06, p. 582] mention that the bounds
\[
\frac{(q_1 q_2)^2}{(q_1, q_2)^4} \le c(\chi_1, \chi_2) \le \frac{(q_1 q_2)^2}{(q_1, q_2)}
\]
for the conductor of $L_{RS}(s; \chi_1, \chi_2)$ can be derived using the local Langlands correspondence. This yields basically the same convexity bound as above.
(b) The Lindelöf Hypothesis (see [IK04, Corollary 5.20], for example) yields
\[
L_{RS}(1/2 + it; \chi_1, \chi_2) \ll_\varepsilon c(\chi_1, \chi_2)^\varepsilon (1 + |t|)^{2\varepsilon} \ll (q_1 q_2)^{2\varepsilon} (1 + |t|)^{2\varepsilon}. \tag{2.16}
\]
This gives $S_N(\chi_1, \chi_2) \ll N^{1/2} Q^\varepsilon$ in (2.7) and therefore we have the following conditional large sieve inequality:
Lemma 2.10. If the Lindelöf Hypothesis holds for Rankin–Selberg convolutions of holomorphic cusp forms of weight one, then
\[
\sum_{\chi \in \widehat{H}_1([Q])} \Big| \sum_{n \le N} a_n \lambda_\chi(n) \Big|^2 \ll_\varepsilon \Big( N (\log N)^3 + N^{1/2} Q^{3/2+\varepsilon} \Big) \sum_{n \le N} |a_n|^2
\]
for all complex numbers $(a_n)_{n \le N}$ and all $\varepsilon > 0$, $Q \ge 1$ and $N \ge 3$.
Given the fact that the essentially best-possible large sieve inequality for Dirichlet characters, Lemma 2.6, can be proved unconditionally, there is some reason to hope that it
might be possible to improve Lemma 2.7 without employing any kind of subconvexity
bounds for the involved L-functions.
Remark. The technique of the proof of Lemma 2.7 does not look promising for real class group characters, which do not give a cusp form in (2.5) but an Eisenstein series. Indeed, we seem to accumulate too many poles of the corresponding Rankin–Selberg L-function: If $\chi$ is a real class group character in $\widehat{H}(q_1)$, then $\lambda_\chi$ is the Dirichlet convolution $\chi_{d_1} * \chi_{d_1'}$ of two Dirichlet characters modulo the absolute values of two fundamental discriminants $d_1$ and $d_1'$ such that $d_1 d_1' = q_1$ (compare Section 2.3.4); denote this character $\chi$ by $\chi_{d_1, d_1'}$. The Rankin–Selberg L-function
\[
L_{RS}(s; \chi_{d_1, d_1'}, \chi_{d_2, d_2'}) = L(s, \chi_{d_1} \chi_{d_2}) L(s, \chi_{d_1} \chi_{d_2'}) L(s, \chi_{d_1'} \chi_{d_2}) L(s, \chi_{d_1'} \chi_{d_2'})
\]
has a pole at $s = 1$ whenever $(d_1 d_1', d_2 d_2') > 1$. Hence, a term of size $Q^{d+\varepsilon} N \sum_{n \le N} |a_n|^2$ accrues for every discriminant $q_1$ that shares a common factor with $Q^d$ other discriminants in $F(Q)$. We will therefore use a different method to cope with real class group characters.
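Remark. There exist other large sieve inequalities for algebraic number fields. For instance, Schumer’s [Sch86] general inequality with explicit dependence of the constants on the parameters of the underlying fixed field yields
\[
\sum_{\chi \in \widehat{H}_1(q)} \Big| \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) \le N}} c(\mathfrak{a}) \chi(\mathfrak{a}) \Big|^2 \ll (\log |q|) \Big( |q| + |qN|^{1/2} + N \Big) \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) \le N}} |c(\mathfrak{a})|^2
\]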
for any fixed q ∈ F and any function c on Z(q). However, the mean-value results of the next
sections consider situations where the underlying number fields vary and therefore also require
a large sieve inequality which has an extra averaging over the discriminant. To our knowledge,
Lemma 2.7 is the first large sieve inequality for varying number fields.
Similarly, there also exist other large sieve inequalities for modular forms of a fixed level (see
[Mic07, §3.1.3], for example).
In the proof of our variant of the Bombieri–Vinogradov theorem we will need the large sieve
inequality for complex class group characters in the following form:
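Corollary 2.11. Let $(a_n)$ be a complex sequence with $\sum_{n \ge 1} |a_n| < \infty$. Let $Q \ge 1$, $k \ge 2$, $c \ge \frac{1}{2}$ and $\varepsilon > 0$. Then
\[
\sum_{\chi \in \widehat{H}_1([Q])} \int_{(c)} \Big| \sum_{n \ge 1} a_n \lambda_\chi(n) n^{-s} \Big|^2 |s|^{-(k+1)} \, |ds| \ll_\varepsilon Q^\varepsilon \sum_{n \ge 1} |a_n|^2 \big( n^{1-2c} + n^{1/2-2c} Q^{5/2} \big) \big( 1 + (\log n)^3 \big). \tag{2.17}
\]
Moreover, we have
\[
\sum_{\chi \in \widehat{H}_1([Q])} \int_{(c)} \Big| \sum_{n \ge 1} a_n \lambda_\chi(n) n^{-s} \Big|^2 |s|^{-(k+1)} \, |ds| \ll_\varepsilon Q^\varepsilon \sum_{n \ge 1} |a_n|^2 \big( n^{1-2c} + n^{1/2-2c} Q^{3/2} \big) \big( 1 + (\log n)^3 \big) \tag{2.18}
\]
if the Lindelöf Hypothesis holds.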
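Proof. Let $(b_n)$ be a complex sequence with $\sum_{n \ge 1} |b_n| < \infty$ and $T \ge 1$. Using Lemma 2.7, we deduce
\[
\sum_{\chi \in \widehat{H}_1([Q])} \int_{-T}^{T} \Big| \sum_{n \ge 1} b_n \lambda_\chi(n) n^{it} \Big|^2 dt \ll_\varepsilon T Q^\varepsilon \sum_{n \ge 1} |b_n|^2 \big( n + n^{1/2} Q^{5/2} \big) \big( 1 + (\log n)^3 \big) \tag{2.19}
\]
as in [Bom87, Théorème 10]. Set $b_n = a_n n^{-c}$ and $T = 1$ in (2.19). Then
\begin{align*}
\sum_{\chi \in \widehat{H}_1([Q])} \int_{c-i}^{c+i} \Big| \sum_{n \ge 1} a_n \lambda_\chi(n) n^{-s} \Big|^2 |s|^{-(k+1)} \, |ds|
&= \sum_{\chi \in \widehat{H}_1([Q])} \int_{-1}^{1} \Big| \sum_{n \ge 1} b_n \lambda_\chi(n) n^{it} \Big|^2 |c - it|^{-(k+1)} \, dt \tag{2.20} \\
&\ll_\varepsilon Q^\varepsilon \sum_{n \ge 1} |a_n|^2 \big( n^{1-2c} + n^{1/2-2c} Q^{5/2} \big) \big( 1 + (\log n)^3 \big).
\end{align*}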
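Let $m$ be a positive integer. Then, again by (2.19) with $T = m + 1$, we have
\begin{align*}
&\sum_{\chi \in \widehat{H}_1([Q])} \bigg( \int_{c-(m+1)i}^{c-mi} \Big| \sum_{n \ge 1} a_n \lambda_\chi(n) n^{-s} \Big|^2 |s|^{-(k+1)} \, |ds| + \int_{c+mi}^{c+(m+1)i} \Big| \sum_{n \ge 1} a_n \lambda_\chi(n) n^{-s} \Big|^2 |s|^{-(k+1)} \, |ds| \bigg) \\
&\qquad \ll \frac{1}{m^{k+1}} \sum_{\chi \in \widehat{H}_1([Q])} \bigg( \int_{-(m+1)}^{-m} \Big| \sum_{n \ge 1} b_n \lambda_\chi(n) n^{it} \Big|^2 dt + \int_{m}^{m+1} \Big| \sum_{n \ge 1} b_n \lambda_\chi(n) n^{it} \Big|^2 dt \bigg) \\
&\qquad \ll \frac{1}{m^{k+1}} \sum_{\chi \in \widehat{H}_1([Q])} \int_{-(m+1)}^{m+1} \Big| \sum_{n \ge 1} b_n \lambda_\chi(n) n^{it} \Big|^2 dt \\
&\qquad \ll_\varepsilon \frac{Q^\varepsilon}{m^k} \sum_{n \ge 1} |a_n|^2 \big( n^{1-2c} + n^{1/2-2c} Q^{5/2} \big) \big( 1 + (\log n)^3 \big).
\end{align*}
Since $k \ge 2$ we may sum over $m$ from 1 to $\infty$ here. Together with (2.20) this leads to the bound (2.17).
By Lemma 2.10, the exponents which are equal to $\frac{5}{2}$ above may be replaced by $\frac{3}{2}$ if we assume the Lindelöf Hypothesis and this yields the bound (2.18).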
2.3 Results of Bombieri–Vinogradov type
Being equipped with the most important ingredient for proofs of average results on the distribution of primes, we can now proceed to primes represented by positive definite binary quadratic forms, for which we will prove that the error term in the corresponding prime number theorem, Theorem 1.2, is small on average over negative fundamental discriminants. The admissible range for the discriminants will, however, be considerably smaller than in the original Bombieri–Vinogradov theorem due to the comparatively weak large sieve result of the preceding section.¹¹
2.3.1 Statement and interpretation of results
In the first two results we will consider smoothed versions of the Chebyshev function for integers represented by positive definite binary quadratic forms. We demonstrate that these functions are “well distributed”¹² with respect to the form classes of almost all negative fundamental discriminants $q \not\equiv 0 \pmod 8$ with $|q|^{10+\varepsilon} \le X (\log X)^{-B}$, where $\varepsilon > 0$ is arbitrarily small and $B$ is some positive number which depends on the power of $(\log X)$ that we wish to save over “trivial” bounds (see Remark 2.18). Moreover, we may even save a positive power of $X$, if we confine ourselves to sets $M(Q)$ of negative fundamental discriminants with absolute value less than $Q$ such that no (positive or negative) fundamental discriminant has many integer multiples in $M(Q)$.
¹¹ After the defence of this thesis, the author noticed two rather simple means to improve this range (and consequently also some of the results of Section 2.5): First, the inequality (2.17) can be improved by noting that the large sieve inequality of Lemma 2.7 is worse than the “trivial” bound for the left-hand side of (2.2) if $N$ is small there. Second, the line $\mathrm{Re}(s) = \frac{1}{2}$ is not an optimal choice for the lines of integration inside the critical strip of the integrals that arise from “Gallagher’s identity” (see (2.44)). We refer the reader to [Dit13] for details.
¹² The notion of “well-distribution” is used here as a vague but suggestive description for the results of this section (as suggested by the discussion in Section 1.2). We will discuss and specify this notion in Section 2.4.
Definition 2.12. For any $Q \ge 1$, let $M(Q)$ be a subset of $F(Q)$, the set of all negative fundamental discriminants $q \not\equiv 0 \pmod 8$ with $|q| \le Q$. We say that $\nu \in [0, 1]$ is a divisor frequency of $M(Q)$ if it satisfies the property:
The cardinality of the set $\{q \in M(Q) : q' \mid q\}$ is bounded by $Q^\nu$ for each (positive or negative) fundamental discriminant $q'$ with $1 < |q'| \le Q$. (2.21)
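To illustrate condition (2.21), the following sketch computes, for a finite set $M(Q)$, the smallest exponent $\nu \ge 0$ that satisfies it, by counting the multiples in $M(Q)$ of every fundamental discriminant $q'$ with $1 < |q'| \le Q$. The helper names and the standard criterion for fundamental discriminants are assumptions of this sketch and do not appear in the text.

```python
import math


def is_squarefree(n):
    n = abs(n)
    return all(n % (p * p) != 0 for p in range(2, math.isqrt(n) + 1))


def is_fundamental(d):
    """Standard criterion: d = 1 mod 4 squarefree, or d = 4m with m = 2, 3 mod 4 squarefree."""
    if d % 4 == 1:
        return d != 1 and is_squarefree(d)
    if d % 4 == 0:
        m = d // 4
        return m % 4 in (2, 3) and is_squarefree(m)
    return False


def is_prime(n):
    return n >= 2 and all(n % p != 0 for p in range(2, math.isqrt(n) + 1))


def negative_discriminants(Q):
    """The set F(Q): negative fundamental discriminants q, not divisible by 8, with |q| <= Q."""
    return [q for q in range(-Q, 0) if q % 8 != 0 and is_fundamental(q)]


def smallest_divisor_frequency(M, Q):
    """Smallest nu >= 0 with #{q in M : q' | q} <= Q**nu for all admissible q'."""
    worst = 1
    for q_prime in range(-Q, Q + 1):
        if abs(q_prime) > 1 and is_fundamental(q_prime):
            worst = max(worst, sum(1 for q in M if q % q_prime == 0))
    return math.log(worst) / math.log(Q)
```

Applied to the odd negative prime discriminants up to $Q$, the function returns 0, whereas for the discriminants of the shape $-4n$ every element is a multiple of $q' = -4$, so the count equals the size of the whole set.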
For all $X \ge 3$, all $q \in F$, all $C \in K(q)$ and all integers $k \ge 0$, we define
\[
\psi_k(X; q, C) = \frac{1}{k!} \sum_{n \le X} \Lambda(n) \Big( \log \frac{X}{n} \Big)^k w(C, n), \tag{2.22}
\]
where $w(C, n)$ is given by (1.4) (see also (1.5) and (1.6)).
Remark. This type of smoothing with the weight factor $\frac{1}{k!} (\log \frac{X}{n})^k$ (so-called Riesz typical means) arises from the inverse Mellin transform of the Dirichlet series $\sum_{n \ge 1} \Lambda(n) w(C, n) n^{-s}$ with the kernel $s^{-(k+1)}$; see [MV07, §5.1], for example. We also recall that the factor $w(C, n)$ accounts for the fact that every positive integer $n$ may usually be represented by forms from different form classes of a given discriminant or by forms from no class at all; see Remark 1.8.
Let $Q \approx X^{1/(15-5\nu)}$ and let $M(Q) \subseteq F(Q)$ be a set with divisor frequency $\nu \in (0, 1]$. We will show that the smoothed and weighted Chebyshev functions $\psi_k(X; q, C)$ are well distributed with respect to the form classes to most discriminants $q \in M(Q)$ and the error term is at most $\frac{Q^{\nu/2}}{|M(Q)|} \cdot X (\log X)^{-A}$ (with $A > 0$ arbitrarily large) for most $q \in M(Q)$:
Theorem 2.13. Let $M(Q) \subseteq F(Q)$ for some $Q \ge 1$ and let $\nu \in (0, 1]$ be a divisor frequency of $M(Q)$. For every integer $k \ge 2$, every (arbitrarily large) real number $A > 0$ and every (arbitrarily small) real number $\varepsilon > 0$, there exists a real number $B = B(A)$ such that
\[
\sum_{q \in M(Q)} \max_{C \in K(q)} \max_{Y \le X} \Big| \psi_k(Y; q, C) - \frac{1}{h(q)} \sum_{K \in K(q)} \psi_k(Y; q, K) \Big| \ll Q^{\nu/2} X (\log X)^{-A} \tag{2.23}
\]
for $Q^{(15-5\nu)+\varepsilon} \le X (\log X)^{-B}$. The implied constant depends on $\varepsilon$, $A$, $k$ and $\nu$; the dependence on $\varepsilon$ is effective, the dependence on $A$, $k$ and $\nu$ is non-effective. The constant $B$ is explicitly computable; in particular, one may choose $B = 10A + 190$.
If the set M (Q) is composed of (odd) negative prime discriminants, then M (Q) has divisor
frequency ν = 0. In this case we just fail to achieve (2.23) with ν = 0. Nevertheless, it is worth
recording that the proof of Theorem 2.13 yields
Theorem 2.14. Let $Q \ge 1$ and let $\Pi(Q)$ be the set of all odd negative prime discriminants whose absolute value is at most $Q$. For every integer $k \ge 2$ and every (arbitrarily small) real number $\varepsilon > 0$, we may find an absolute constant $B$ such that
\[
\sum_{q \in \Pi(Q)} \max_{C \in K(q)} \max_{Y \le X} \Big| \psi_k(Y; q, C) - \frac{1}{h(q)} \sum_{K \in K(q)} \psi_k(Y; q, K) \Big| \ll_{\varepsilon, k} X (\log X)^{k+3}
\]
for $Q^{15+\varepsilon} \le X (\log X)^{-B}$.
To put this last result into perspective, we consider the form $f_q(x, y) = x^2 + xy + \frac{1-q}{4} y^2$ for each negative fundamental discriminant $q \equiv 1 \pmod 4$. Note that $f_q$ lies in the principal class $C_0(q)$ of $K(q)$ then and consider the function
\[
S_q(X) = \sum_{\substack{p \le X \\ \exists x, y \in \mathbb{Z}:\ f_q(x, y) = p}} \log(p) \Big( \log \frac{X}{p} \Big)^2,
\]
which gives a smoothed and weighted count of the primes represented by $f_q$ up to $X$. By Remark 1.9, we have
\[
S_q(X) = \frac{1}{2} \sum_{\substack{n \le X \\ \exists x, y \in \mathbb{Z}:\ f_q(x, y) = n}} \Lambda(n) \Big( \log \frac{X}{n} \Big)^2 w(C_0, n) + O\big( X^{1/2} (\log X)^3 \big)
\]
for all negative fundamental discriminants $q \equiv 1 \pmod 4$ with $|q| \le X$. Thus, Theorem 2.14 implies that, for most negative prime discriminants $q$ with $|q| \le Q \approx X^{1/15}$, the function $S_q(X)$ deviates from the (expected) average function
\[
\frac{1}{2h(q)} \sum_{K \in K(q)} e(K) \sum_{\substack{p \le X \\ p \in R(q, K)}} \log(p) \Big( \log \frac{X}{p} \Big)^2,
\]
where the factor $e(K)$ was defined in (1.7), by only a small amount at most; moreover, the sum (over $q \in \Pi(Q)$) of these discrepancies is a positive power of $X$ smaller than “trivial” estimates can guarantee. We will quantify this improvement in Remark 2.18.
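For small parameters the function $S_q(X)$ can be evaluated by brute force. The sketch below lists the primes $p \le X$ represented by $f_q(x, y) = x^2 + xy + \frac{1-q}{4} y^2$ and sums $\log(p) (\log \frac{X}{p})^2$ over them; it assumes an integer $X$, uses the identity $4 f_q(x, y) = (2x + y)^2 + |q| y^2$ to bound the search, and all names are illustrative rather than taken from the text.

```python
import math


def is_prime(n):
    return n >= 2 and all(n % p != 0 for p in range(2, math.isqrt(n) + 1))


def represented_primes(q, X):
    """Primes p <= X of the form x^2 + x*y + ((1 - q)/4)*y^2, for q < 0 with q = 1 mod 4."""
    assert q < 0 and q % 4 == 1
    c = (1 - q) // 4
    y_max = math.isqrt(4 * X // (-q))      # from |q| * y^2 <= 4 * f(x, y) <= 4X
    x_max = math.isqrt(X) + y_max + 1      # from |2x + y| <= 2 * sqrt(X)
    found = set()
    for y in range(0, y_max + 1):          # f(-x, -y) = f(x, y), so y >= 0 suffices
        for x in range(-x_max, x_max + 1):
            value = x * x + x * y + c * y * y
            if value <= X and is_prime(value):
                found.add(value)
    return sorted(found)


def S(q, X):
    """Smoothed count: sum of log(p) * log(X/p)^2 over the represented primes p <= X."""
    return sum(math.log(p) * math.log(X / p) ** 2 for p in represented_primes(q, X))
```

For $q = -23$ the form is $x^2 + xy + 6y^2$, and the only primes up to 60 that it represents are 23 and 59; the primes represented by the other two form classes of this discriminant do not appear.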
Unfortunately, this saving of a power of $X$ is not possible for the analogous smoothed and weighted count of primes represented by the forms of the shape $x^2 + ny^2$ because the divisor $-4$ of the corresponding discriminants $-4n$ occurs too often, i.e. we have to choose $\nu = 1$ (if we do not want $\nu$ to depend on $Q$) in condition (2.21) and therefore only save a power of $(\log X)$.
If $\nu < 1$, then it does not seem to be possible to unsmooth these results, i.e. to take $k = 0$, while keeping the given estimates. This is because the unsmoothing process (see Section 2.3.6) produces a term of size $Q^{1/2} X (\log X)^{-D}$ (where $D$ is an arbitrary positive number).
However, for $\nu = 1$, i.e. for arbitrary sets $M(Q)$ of negative fundamental discriminants, these extra terms of size $Q^{1/2} X (\log X)^{-D}$ are not too large and we can obtain from Theorem 2.13 the following result, which has more resemblance to the original Bombieri–Vinogradov theorem:
Theorem 2.15. For all $q \in F$ and all $C \in K(q)$, define
\[
\psi(X; q, C) = \sum_{\substack{n \le X \\ n \in R(q, C)}} \Lambda(n).
\]
Let $F(Q)$ be the set of all negative fundamental discriminants $q \not\equiv 0 \pmod 8$ with $|q| \le Q$. Let $A > 0$ be an arbitrarily large real number, let $\varepsilon > 0$ be an arbitrarily small real number and let $e(C)$ be defined by (1.7). Then there exists a real number $B = B(A)$ such that
\[
\sum_{q \in F(Q)} \max_{C \in K(q)} \max_{Y \le X} \Big| \psi(Y; q, C) - \frac{Y}{e(C) h(q)} \Big| \ll_{\varepsilon, A} Q^{1/2} X (\log X)^{-A}
\]
for $Q^{10+\varepsilon} \le X (\log X)^{-B}$.
By standard methods, the function $\psi(X; q, C)$ can be replaced by the prime counting function
\[
\pi(X; q, C) = |\{p \le X \text{ prime} \mid \forall f \in C : f(x, y) = p \text{ for some } x, y \in \mathbb{Z}\}|.
\]
This lets us interpret Theorem 2.15 as an average result for the error term in the prime number theorem for primes represented by positive definite binary quadratic forms and therefore as an analogue of Theorem 2.3:
Theorem 2.16. Let $A > 0$ be an arbitrarily large real number, let $\varepsilon > 0$ be an arbitrarily small real number and let $e(C)$ be defined by (1.7). Then there exists a real number $B = B(A)$ such that
\[
\sum_{q \in F(Q)} \max_{C \in K(q)} \Big| \pi(X; q, C) - \frac{\mathrm{li}(X)}{e(C) h(q)} \Big| \ll Q^{1/2} X (\log X)^{-A}
\]
for $Q^{10+\varepsilon} \le X (\log X)^{-B}$. The implied constant depends on $\varepsilon$ and $A$; the dependence on $\varepsilon$ is effective, the dependence on $A$ is non-effective. The constant $B$ is explicitly computable; in particular, one may choose $B = 40A + 220$.
Remark. Recall here that the weight e(C) equals w(C, n) whenever n is a prime that does not
divide q and it is therefore also the weight factor in the asymptotic behaviour of π(X; q, C) that
is necessary to compensate for the distinct behaviour of ambiguous and non-ambiguous classes
in the representability of primes.
Under the assumption of the Lindelöf Hypothesis, we have seen in Section 2.2 that a stronger
large sieve inequality holds for complex class group characters. It yields an increased range for
the fundamental discriminants in the statements above:
Theorem 2.17. Assume the Lindelöf Hypothesis for Rankin–Selberg convolutions of holomorphic cusp forms of weight one (or more specifically: assume that estimate (2.16) holds
for all pairs of distinct complex class group characters). Then
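(a) Theorem 2.13 holds with $Q^{6-3\nu+3\varepsilon} \le X (\log X)^{-B}$ if $\nu > \frac{1}{2} + \varepsilon$ and with $Q^{7-5\nu+5\varepsilon} \le X (\log X)^{-B}$ if $\nu \le \frac{1}{2} + \varepsilon$;
(b) Theorem 2.14 holds with $Q^{7+\varepsilon} \le X (\log X)^{-B}$;
(c) Theorems 2.15 and 2.16 hold with $Q^{3+\varepsilon} \le X (\log X)^{-B}$.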
We will justify these conditional improvements in Remarks 2.26 and 2.27.
Remark 2.18. How do these results compare to “trivial” estimates? There is no upper bound for the number of primes $p \le X$ represented by a given binary quadratic form which is as trivial as the estimate $\pi(X; q, a) \le \frac{X}{q} + 1$ for primes in arithmetic progressions, where the right-hand side of the inequality is simply the number of integers in the given arithmetic progression (or this number plus one). However, the bound
\[
\sum_{\substack{n \le X \\ n \in R(q, C)}} 1 \ll \frac{X}{\sqrt{|q|}} + X^{1/2}
\]
(see (1.14)) can be proved by elementary means and may therefore be considered as a suitable substitute for a completely trivial bound. This estimate and (1.6) give the bound
\[
O\Big( X (\log X)^{k+1} \sum_{q \in M(Q)} \frac{1}{\sqrt{|q|}} \Big) \tag{2.24}
\]
for the left-hand side of (2.23) in Theorem 2.13. If $M(Q) = F(Q)$ (note that $|F(Q)| \asymp Q$), we beat this bound by an arbitrary power of $(\log X)$.
As for Theorem 2.15 and Theorem 2.16, it is known that the class number $h(q)$ has the lower bound
\[
|q|^{1/2} (\log |q|)^{-1} \ll h(q) \tag{2.25}
\]
if the primitive real Dirichlet character modulo $q$ is not exceptional, and $|q|^{1/2-\varepsilon} \ll_\varepsilon h(q)$ for all $\varepsilon > 0$ if it is exceptional; see Section 2.3.2. Thus, we get the “trivial” bound
\[
O_\varepsilon\big( Q^{1/2+\varepsilon} X (\log X) \big)
\]
in Theorem 2.15. However, since exceptional discriminants are very rare (see Proposition 2.20), it is reasonable to use (2.25) and aim to improve on the “trivial” bounds
\[
O\big( Q^{1/2} (\log Q) X (\log X) \big)
\]
for Theorem 2.15 and
\[
O\big( Q^{1/2} (\log Q) X \big)
\]
for Theorem 2.16, which we indeed improve by an arbitrary power of $(\log X)$, just as in the original Bombieri–Vinogradov theorem.
As we have already mentioned above, we do even better if $\nu < 1$ because we then beat the bound
\[
O\Big( X (\log X)^{k+1} \cdot \frac{Q^{1/2}}{\log Q} \Big), \tag{2.26}
\]
which (2.24) yields for $M(Q) = \Pi(Q)$, for example, by a positive power of $Q$ (and therefore by a positive power of $X$): Let $X$ be large, $Q = X^{1/16}$ and $k = 2$; then Theorem 2.14 beats (2.26) by a factor of size $\frac{Q^{1/2}}{\log Q} (\log X)^{-2} \gg_\varepsilon X^{1/32-\varepsilon}$ for all arbitrarily small $\varepsilon > 0$. This result is unusual as it does not seem to be possible to achieve a saving of a positive power of $X$ over the trivial bound for the corresponding smooth version of the original Bombieri–Vinogradov theorem.
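The size of this saving can be written out step by step. The following computation only divides the bound (2.26) by the bound of Theorem 2.14 and then substitutes $Q = X^{1/16}$, so that $\log Q = \frac{1}{16} \log X$; it is meant as a worked check of the exponent $\frac{1}{32}$ and adds nothing beyond the statements above.

```latex
\[
\frac{X (\log X)^{k+1} \cdot Q^{1/2} / \log Q}{X (\log X)^{k+3}}
  = \frac{Q^{1/2}}{\log Q} \, (\log X)^{-2}
  = \frac{16 \, X^{1/32}}{(\log X)^{3}}
  \gg_\varepsilon X^{1/32 - \varepsilon} .
\]
```

The first equality holds for every $k$; only the second one uses the particular choice of $Q$.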
Something that we do not achieve is to prove the conditional error term in Theorem 1.16 “on average” as the original Bombieri–Vinogradov theorem is able to do for primes in arithmetic progressions. Due to the shorter range for $q$ in Theorem 2.16, we may only deduce:
Corollary 2.19. Let $X \ge 3$, $A > 0$ and $\varepsilon > 0$. Let $Q$ satisfy $Q^{10+\varepsilon} = X (\log X)^{-B}$ with $B = B(A)$ as in Theorem 2.16. Set $d = \frac{19+2\varepsilon}{20+2\varepsilon}$. Then there exist constants $D_1 = D_1(A, \varepsilon) > 0$ and $D_2 = D_2(A, \varepsilon) > 0$ such that
\[
\pi(X; q, C) = \frac{\mathrm{li}(X)}{e(C) h(q)} + O_{\varepsilon, A}\big( X^d (\log X)^{-D_1} \big)
\]
for all form classes of all negative fundamental discriminants $q \not\equiv 0 \pmod 8$ satisfying $|q| \le Q$, with the possible exception of at most $Q (\log X)^{-D_2}$ discriminants in this range.
Remark. One reason for the comparatively short ranges that are, for now, admissible for the
discriminants in our results may be found in the fact that the size of a form class group is much
smaller than the corresponding discriminant. This offers therefore less potential for possible
cancellation effects than in the case of arithmetic progressions where the number of reduced
residue classes of a modulus is usually only slightly smaller than the modulus itself.
In order to prove Theorems 2.13 and 2.14, and in consequence also for the other statements
of this section, we will largely follow Gallagher’s proof of the original Bombieri–Vinogradov
theorem as presented by Bombieri in [Bom87, §7]. The key ingredients will be:
(i) The large sieve inequality for complex ideal class group characters, which we have found
in Section 2.2; see Section 2.3.3.
(ii) The original Bombieri–Vinogradov theorem itself, which we will use to estimate the contribution coming from real ideal class group characters; see Section 2.3.4.
(iii) Landau’s theorem on the scarcity of exceptional moduli, that is, the rarity of integers q
for which there could possibly exist a Dirichlet character χ modulo q whose associated
L-function has a Landau–Siegel zero; see Proposition 2.20 in the next subsection.
(iv) A result of Siegel–Walfisz type for ideal class group characters by Goldstein; see
Lemma 2.22 in the next subsection. The use of this result and the original Siegel–Walfisz
theorem account for the ineffectivity of all results in this chapter.
We choose to follow Gallagher’s method as it is sufficient for the investigation of primes
represented by binary quadratic forms and it is slightly easier (to this author’s mind) than
the “modern” proof of the original Bombieri–Vinogradov theorem as presented in [IK04, §17]
and [Bom87, §12]. The modern proof is characterized by a more systematic use of bilinear
forms and it is thus capable of yielding more general results on equidistribution in arithmetic
progressions (see [IK04, Theorem 17.4], for example). We will get an impression of such results
in Section 2.4, where we prove a result on the mean square distribution (over form classes)
for more general arithmetic functions than the prime counting function and (smooth) versions
of the Chebyshev function, which we consider in the theorems above. However, a comparison
between our Theorem 2.30 and [IK04, Theorem 17.5], for example, shows that our restrictions
on these functions are quite severe and an analogous version of [IK04, Theorem 17.4] might
therefore be quite inconvenient to use.
Nevertheless, it should be possible to fit the ingredients (i)-(iv) that we mentioned above
into a proof of Theorem 2.15 that follows [IK04, §17]. It is less clear whether, by this method,
we could also achieve its smooth version, which saves a positive power of X (or Q) in certain situations as in Theorem 2.14: In the modern proof one splits certain bilinear sums into
small boxes and “excess areas”; the latter are trivially estimated (and no other estimate seems
attainable), which prevents at that point the saving of a positive power of X.
2.3.2 Preliminaries
Let $A > 0$ (arbitrarily large) and $\varepsilon > 0$ (arbitrarily small; for simplicity we will also assume $\varepsilon \le \frac{1}{4}$ in the end) be real numbers; let $k \ge 2$ be an integer; let $M(Q) \subseteq F(Q)$ be a set of negative fundamental discriminants $q \not\equiv 0 \pmod 8$ with divisor frequency $\nu \in [0, 1]$. These numbers will be considered as fixed parameters which the implied constants in the estimates of this and the subsequent subsections may depend on; the constants will not depend on the positive real numbers $X$ and $Q$, for which we always assume that $Q \le X$.
We have already mentioned in Section 1.2 that questions on primes that are represented by
binary quadratic forms are usually best dealt with by looking at a similar problem for ideals
in the corresponding quadratic fields. The weights w(C, n) in the definition of the functions
ψk (X; q, C) already put these functions into the right form for this transition. Indeed, by
definition (1.4) of w(C, n), we have
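\[
\psi_k(X; q, C) = \frac{1}{k!} \sum_{\substack{\mathfrak{a} \in B_q(C) \cap Z(q) \\ N(\mathfrak{a}) \le X}} \Lambda(N(\mathfrak{a})) \Big( \log \frac{X}{N(\mathfrak{a})} \Big)^k
\]
for all $q \in F(Q)$ and all $C \in K(q)$. For ease of notation we set
\[
E_k(X; q) = \max_{C \in K(q)} \max_{Y \le X} \Big| \psi_k(Y; q, C) - \frac{1}{h(q)} \sum_{K \in K(q)} \psi_k(Y; q, K) \Big|. \tag{2.27}
\]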
Thus, if the assumptions of Theorem 2.13 and Theorem 2.14 hold, we have to prove the bounds
\[
\sum_{q \in M(Q)} E_k(X; q) \ll Q^{\nu/2} X (\log X)^{-A} \tag{2.28}
\]
if $\nu > 0$ and $Q^{(15-5\nu)+\varepsilon} \le X (\log X)^{-B(A)}$, and
\[
\sum_{q \in \Pi(Q)} E_k(X; q) \ll X (\log X)^{k+3} \tag{2.29}
\]
if $Q^{15+\varepsilon} \le X (\log X)^{-B}$.
We start the proof of both (2.28) and (2.29) by using the orthogonality property of the ideal class group characters. We know from Section 1.2 that $H(q)$ is a finite abelian group and so is the group of ideal class group characters $\widehat{H}(q) \simeq H(q)$. Define
\[
\psi_k(Y; q, \chi) = \frac{1}{k!} \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) \le Y}} \Lambda(N(\mathfrak{a})) \chi(\mathfrak{a}) \Big( \log \frac{Y}{N(\mathfrak{a})} \Big)^k
\]
for all $Y \le X$, all $q \in F(Q)$, all $\chi \in \widehat{H}(q)$ and all $k \ge 0$.
The orthogonality property of the characters of finite abelian groups (see [IK04, §3.1], for
example) yields
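\begin{align*}
\psi_k(Y; q, C) &= \frac{1}{k!} \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) \le Y}} \Lambda(N(\mathfrak{a})) \Big( \log \frac{Y}{N(\mathfrak{a})} \Big)^k \bigg( \frac{1}{h(q)} \sum_{\chi \in \widehat{H}(q)} \overline{\chi}(B_q(C)) \chi(\mathfrak{a}) \bigg) \tag{2.30} \\
&= \frac{1}{h(q)} \sum_{\chi \in \widehat{H}(q)} \overline{\chi}(B_q(C)) \psi_k(Y; q, \chi)
\end{align*}
for all $q \in F(Q)$ and all $C \in K(q)$. Together with the triangle inequality we thus get
\[
\sum_{q \in M(Q)} E_k(X; q) \le \sum_{q \in M(Q)} \max_{Y \le X} \frac{1}{h(q)} \sum_{\chi \ne \chi_0^{(q)}} |\psi_k(Y; q, \chi)|. \tag{2.31}
\]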
Remark. Here we have to ignore the cancellation that presumably occurs in the sum on the right side of (2.30). This is a defect which is also characteristic of all proofs of the original Bombieri–Vinogradov theorem and presumably the superficial reason why it is not possible to extend its range beyond the square-root barrier.
As before, for every fundamental discriminant $q \not\equiv 0 \pmod 8$, we let $\chi_q$ denote the unique primitive real Dirichlet character modulo $|q|$. By Siegel’s theorem (see [MV07, Theorem 11.14], for example), we have the unconditional, non-effective lower bound $|q|^{-\varepsilon} \ll_\varepsilon L(1, \chi_q)$ for the corresponding Dirichlet L-function. This yields the lower class number bound
\[
|q|^{1/2-\varepsilon} \ll_\varepsilon h(q) \tag{2.32}
\]
by Dirichlet’s class number formula (2.12). Yet, there exists a better bound for many $q$ and it turns out that the contribution from the other discriminants is often negligible: We know (see [MV07, Theorem 11.3]) that there exists an absolute constant $c_1 > 0$ such that, for any $q \in F$, the Dirichlet L-function $L(s, \chi_q)$ has at most one zero in the set
\[
\Big\{ s = \sigma + it \in \mathbb{C} : \sigma > 1 - \frac{c_1}{\log\big( |q| (|t| + 4) \big)} \Big\}.
\]
The potential only zero in this region (for a fixed admissible value of $c_1$) is called the exceptional zero or Landau–Siegel zero for the modulus $|q|$ and $\chi_q$ is then called an exceptional character; it is conjectured that no such zero and character exist for any modulus. Moreover, there exists $c_2 > 0$ such that $L(1, \chi_q) \ge c_2 (\log |q|)^{-1}$ if $L(s, \chi_q)$ has no exceptional zero (see [MV07, Theorem 11.4]). Thus, by the class number formula (2.12), there exists $c_3 > 0$ such that
\[
|q|^{1/2} (\log |q|)^{-1} \le c_3 h(q) \tag{2.33}
\]
holds for all $q \in F$ for which $L(s, \chi_q)$ has no exceptional zero. We fix such a value of $c_3$.
We now give an upper bound for the contribution to the right side of (2.31) coming from the (presumably empty) set $F_{ex}(Q) \subset F(Q)$ of exceptional fundamental discriminants; here we call $q \in F$ exceptional if it fails to satisfy (2.33) for the fixed value of $c_3$ (and therefore $L(s, \chi_q)$ has an exceptional zero then).
Proposition 2.20. Let $M_{ex}(Q) = F_{ex}(Q) \cap M(Q)$ be the (possibly empty) subset of exceptional fundamental discriminants of $M(Q)$. Then we have
\[
\sum_{q \in M_{ex}(Q)} \max_{Y \le X} \frac{1}{h(q)} \sum_{\chi \in \widehat{H}(q) \setminus \{\chi_0^{(q)}\}} |\psi_k(Y; q, \chi)| \ll X (\log X)^{k+3}.
\]
In particular, exceptional discriminants contribute acceptably to the right side of (2.31) if either $\nu > 0$ and $Q \ge (\log X)^{(2A+2k+6)/\nu}$ or $\nu = 0$.
Remark 2.21. The case $Q < (\log X)^{(2A+2k+6)/\nu}$ will be dealt with later on by means of an appropriate Siegel–Walfisz type theorem; see Remark 2.23 below. Moreover, note that if $\nu = 0$, then this contribution would not be negligible in Theorem 2.13, which is why we get the slightly weaker bound in Theorem 2.14.
Proof. Let $q_1$ be an exceptional modulus. By a theorem of Landau (see [MV07, Corollary 11.9]),
we know that there cannot exist an exceptional modulus $q$ with $q_1 < q < q_1^2$. Thus, there can be
at most $O(\log \log Q)$ exceptional moduli which are smaller than $Q$. Using standard estimates
(see (2.8)), we also have
\[
  |\psi_k(Y; q, \chi)|
  \leq \sum_{n \leq Y} \Lambda(n) \Bigl(\log \frac{Y}{n}\Bigr)^{k}
       \Bigl| \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) = n}} \chi(\mathfrak{a}) \Bigr|
  \leq (\log X)^k \sum_{n \leq X} \log(n)\, \tau(n)
  \ll X (\log X)^{k+2}
\]
for all $q \in M_{\mathrm{ex}}(Q)$ and all $\chi \in \widehat{H}(q)$, and the first assertion follows immediately.
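The count $O(\log \log Q)$ used in this proof can be made concrete with a short Python sketch. It assumes the extreme scenario permitted by Landau's theorem, in which every exceptional modulus is the square of the previous one; the function name and the starting modulus 3 are illustrative choices, not part of the argument.

```python
from math import log, log2

def max_exceptional_count(Q, smallest=3):
    """Largest possible number of moduli <= Q if each one is at least
    the square of its predecessor (the gap forced by Landau's theorem)."""
    count, q = 0, smallest
    while q <= Q:
        count += 1
        q = q * q          # the next exceptional modulus is at least q^2
    return count

def loglog_bound(Q, smallest=3):
    """Upper bound log2(log Q / log smallest) + 1 for the count above."""
    return log2(log(Q) / log(smallest)) + 1

if __name__ == "__main__":
    for Q in (10**6, 10**100):
        print(Q, max_exceptional_count(Q), loglog_bound(Q))
```

Even for $Q = 10^{100}$ the count stays in the single digits, which is why the exceptional discriminants cost only one extra logarithmic factor in Proposition 2.20.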
If $\nu > 0$ and $Q \geq (\log X)^{(2A+2k+6)/\nu}$, then
\[
  X (\log X)^{k+3} \leq Q^{\nu/2} X (\log X)^{-A},
\]
i.e. the contribution from exceptional discriminants is acceptable for Theorem 2.13.
Therefore it remains to estimate the contribution from non-exceptional discriminants on the
right side of (2.31), i.e. we have to bound
\[
  \max_{Y \leq X} \sum_{q \in M'(Q)} \frac{1}{h(q)} \sum_{\chi \neq \chi_0^{(q)}} |\psi_k(Y; q, \chi)|, \tag{2.34}
\]
where
\[
  M'(Q) = M(Q) \setminus M_{\mathrm{ex}}(Q) \quad \text{or} \quad M'(Q) = \Pi(Q) \setminus M_{\mathrm{ex}}(Q),
\]
and we will show that it is bounded above by
\[
  Q^{\nu/2} X (\log X)^{-A} \tag{2.35}
\]
for both $\nu > 0$ and $\nu = 0$.
If $Q$ is very small, a uniform bound for $\psi_0(X; q, \chi)$ exists, which easily yields the desired
bound for (2.34); the following is a special case of Goldstein’s generalization of the Siegel–Walfisz
theorem [Gol70]:
Lemma 2.22 (Goldstein). Suppose that $q \in F$ with $|q| \leq (\log X)^D$ for some positive constant $D$.
Then
\[
  \psi_0(X; q, \chi) \ll_D X (\log X)^{-2D}
\]
for all non-trivial class group characters $\chi \in \widehat{H}(q)$. The implied constant does not depend on $q$
or $\chi$, but is ineffective.
So suppose that $Q = (\log X)^D$ for some $D \geq A + k$. We have
\[
  \psi_k(Y; q, \chi) = \int_1^Y \psi_{k-1}(t; q, \chi) \frac{dt}{t}
  \ll \max_{y \leq Y} |\psi_0(y; q, \chi)| \cdot (\log Y)^k.
\]
Summing over $q \in M'((\log X)^D)$, Lemma 2.22 therefore yields the upper bound (2.35) for (2.34)
if $Q = (\log X)^D$.
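The integral relation between consecutive Riesz means can be checked numerically in the simplest setting. The sketch below uses the classical Chebyshev function (no class group character), which is an assumption made purely for illustration; it compares $\psi_1(X) = \sum_{n \leq X} \Lambda(n) \log(X/n)$ with the exactly integrated step function $\int_1^X \psi_0(t)\, dt/t$ and also checks the trivial bound $\psi_1(X) \leq \psi_0(X) \log X$. All function names are hypothetical.

```python
from math import factorial, log

def von_mangoldt_table(N):
    """lam[n] = log p if n is a power of the prime p, and 0 otherwise."""
    lam = [0.0] * (N + 1)
    is_composite = [False] * (N + 1)
    for p in range(2, N + 1):
        if not is_composite[p]:
            for m in range(2 * p, N + 1, p):
                is_composite[m] = True
            pk = p
            while pk <= N:
                lam[pk] = log(p)
                pk *= p
    return lam

def psi_k(X, k, lam):
    """Riesz mean (1/k!) * sum_{n <= X} Lambda(n) * log(X/n)^k for an integer X."""
    return sum(lam[n] * log(X / n) ** k for n in range(1, X + 1)) / factorial(k)

def integral_of_psi0(X, lam):
    """Exact value of int_1^X psi_0(t) dt/t; psi_0 is constant on each [m, m+1)."""
    total, psi0 = 0.0, 0.0
    for m in range(1, X):
        psi0 += lam[m]
        total += psi0 * log((m + 1) / m)
    return total

if __name__ == "__main__":
    lam = von_mangoldt_table(1000)
    print(psi_k(1000, 1, lam), integral_of_psi0(1000, lam))
```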
Remark 2.23. We have now proved that the bounds in both Theorem 2.13 and Theorem 2.14
hold for $Q \leq (\log X)^D =: Q_0$ and it remains to bound (2.34) with $M'(Q)$ replaced by
\[
  M''(Q) := M'(Q) \cap \{ q : |q| > Q_0 \}
\]
for a value of $D$ that we will choose later; compare (2.67). We already record that, because of
Remark 2.21, we must choose $D$ at least as large as $D_1 := (2A + 2k + 6)/\nu$ if $\nu > 0$. If $\nu = 0$,
we will have to choose some $D \geq D_1 := A + k$ to guarantee the bound (2.35) for (2.34) (which
is more than enough for Theorem 2.14).
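We recall that we have defined in Remark 1.17 the numbers $\lambda_\chi(n)$ and the class group
L-functions $L(s, \lambda_\chi)$ for all class group characters. The expansion of the logarithmic derivative
of such an L-function is given by
\[
  \frac{L'}{L}(s, \lambda_\chi) = - \sum_{\mathfrak{a} \in Z(q)} \widetilde{\Lambda}(\mathfrak{a}) \chi(\mathfrak{a}) N(\mathfrak{a})^{-s}, \tag{2.36}
\]
where
\[
  \widetilde{\Lambda}(\mathfrak{a}) =
  \begin{cases}
    \log N(\mathfrak{p}) & \text{if } \mathfrak{a} = \mathfrak{p}^m \text{ for some prime ideal } \mathfrak{p} \in Z(q) \text{ and some integer } m, \\
    0 & \text{otherwise.}
  \end{cases}
\]
Similarly to the computation of the inverse Mellin transform of $L'/L$, one can show (see [MV07,
(5.22)], for example) that
\[
  - \frac{1}{2\pi i} \int_{(c)} \frac{L'}{L}(s, \lambda_\chi)\, Y^s s^{-(k+1)}\, ds
  = \frac{1}{k!} \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) \leq Y}} \widetilde{\Lambda}(\mathfrak{a}) \chi(\mathfrak{a}) \Bigl(\log \frac{Y}{N(\mathfrak{a})}\Bigr)^{k}
  =: \widetilde{\psi}_k(Y; q, \chi) \tag{2.37}
\]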
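for all $c > 1$. This does not equal $\psi_k(Y; q, \chi)$, but we miss it only by a negligible margin: Set
\[
  c(\mathfrak{a}) = \chi(\mathfrak{a}) \Bigl(\log \frac{Y}{N(\mathfrak{a})}\Bigr)^{k}
\]
and note that we have
\[
\begin{aligned}
  k!\, \widetilde{\psi}_k(Y; q, \chi)
  &= \sum_{p \leq Y} \sum_{\substack{\mathfrak{p} \in Z(q) \\ N(\mathfrak{p}) = p}} (\log p)\, c(\mathfrak{p})
   + \sum_{p \leq Y^{1/2}} \sum_{\substack{\mathfrak{p} \in Z(q) \\ N(\mathfrak{p}) = p^2}} (\log p^2)\, c(\mathfrak{p})
   + \sum_{\ell \geq 2} \sum_{\substack{\mathfrak{p} \in Z(q) \\ N(\mathfrak{p})^\ell \leq Y}} \log(N(\mathfrak{p}))\, c(\mathfrak{p}^\ell) \\
  &= \sum_{p \leq Y} \sum_{\substack{\mathfrak{p} \in Z(q) \\ N(\mathfrak{p}) = p}} (\log p)\, c(\mathfrak{p}) + O\bigl(Y^{1/2} (\log Y)^{k+2}\bigr)
\end{aligned}
\]
and
\[
\begin{aligned}
  k!\, \psi_k(Y; q, \chi)
  &= \sum_{\ell \geq 1} \sum_{p \leq Y^{1/\ell}} (\log p) \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) = p^\ell}} c(\mathfrak{a}) \\
  &= \sum_{p \leq Y} \sum_{\substack{\mathfrak{p} \in Z(q) \\ N(\mathfrak{p}) = p}} (\log p)\, c(\mathfrak{p})
   + \sum_{\ell \geq 2} \sum_{p \leq Y^{1/\ell}} (\log p) \sum_{\substack{\mathfrak{a} \in Z(q) \\ N(\mathfrak{a}) = p^\ell}} c(\mathfrak{a}) \\
  &= \sum_{p \leq Y} \sum_{\substack{\mathfrak{p} \in Z(q) \\ N(\mathfrak{p}) = p}} (\log p)\, c(\mathfrak{p}) + O\bigl(Y^{1/2} (\log Y)^{k+3}\bigr).
\end{aligned}
\]
Hence
\[
  \psi_k(Y; q, \chi) = \widetilde{\psi}_k(Y; q, \chi) + O\bigl(Y^{1/2} (\log Y)^{k+3}\bigr). \tag{2.38}
\]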
Summing over $q \in M''(Q)$, the contribution of the remainder terms is $\ll Q X^{1/2} (\log X)^{k+3}$
in (2.34) if we replace $\psi_k(Y; q, \chi)$ by $\widetilde{\psi}_k(Y; q, \chi)$ there. But this is negligible in (2.28) and (2.29).
Thus it remains to estimate
\[
  \max_{Y \leq X} \sum_{q \in M''(Q)} \frac{1}{h(q)}
  \sum_{\substack{\chi \in \widehat{H}(q) \\ \chi \neq \chi_0^{(q)}}} |\widetilde{\psi}_k(Y; q, \chi)|. \tag{2.39}
\]
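Next, we split (2.39) into
\[
\begin{aligned}
  &\max_{Y \leq X} \sum_{q \in M''(Q)} \frac{1}{h(q)}
     \sum_{\substack{\chi \in \widehat{H}(q) \\ \chi^2 \neq \chi_0^{(q)}}} |\widetilde{\psi}_k(Y; q, \chi)|
   + \max_{Y \leq X} \sum_{q \in M''(Q)} \frac{1}{h(q)}
     \sum_{\substack{\chi \in \widehat{H}(q) \setminus \{\chi_0^{(q)}\} \\ \chi^2 = \chi_0^{(q)}}} |\widetilde{\psi}_k(Y; q, \chi)| \\
  &\qquad = E_k'(Q, X) + E_k''(Q, X),
\end{aligned}
\tag{2.40}
\]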
say, i.e. we split it into sums over complex class group characters and sums over real class group
characters. We will estimate both terms separately in the next two subsections and show that
they are both bounded above by (2.35). Together with the results for exceptional discriminants
(Proposition 2.20) and small discriminants (Remark 2.23) we may then conclude that (2.28)
and (2.29) hold.
Remark 2.24. The only other work known to the author that considers a somewhat related
average over fundamental discriminants and class group characters is [FI03a]: Let $\phi$ be a smooth
even test function on $\mathbb{R}$ whose Fourier transform $\widehat{\phi}$ is compactly supported. On the assumption
of the Generalized Riemann Hypothesis for class group L-functions, Fouvry and Iwaniec consider
\[
  B(q; \phi) := \frac{1}{h(q)} \sum_{\chi \in \widehat{H}(q)} \sum_{\gamma_\chi} \phi\Bigl( \frac{\gamma_\chi}{2\pi} \log |q| \Bigr),
\]
where the inner sum is over the imaginary parts of the non-trivial zeros $\rho = \frac{1}{2} + i\gamma_\chi$ of $L(s, \lambda_\chi)$.
They conjecture that
\[
  B(q; \phi) = \int_{-\infty}^{\infty} \phi(x) \Bigl( 1 - \frac{\sin 2\pi x}{2\pi x} \Bigr) dx + o(1) \tag{2.41}
\]
as $q \to -\infty$ in the set of fundamental discriminants. In fact, they verify the conjecture when
$\widehat{\phi}$ is supported in $[-1, 1]$ and then go on to establish it for almost all negative fundamental
discriminants if $\widehat{\phi}$ is supported in a wider interval, i.e. beyond the discontinuities of the Fourier
transform of $1 - \frac{\sin 2\pi x}{2\pi x}$ at $\pm 1$:
Theorem (Fouvry–Iwaniec, [FI03a, Theorem 1.2]). Assume the Generalized Riemann Hypothesis for class group L-functions and Dirichlet L-functions. Let $0 < \vartheta < \frac{4}{3}$ and let $\phi$ be a test
function with $\widehat{\phi}$ compactly supported in $(-\vartheta, \vartheta)$. Let $Q > 3$ and let $M(Q)$ be a set of squarefree
negative integers $q \equiv 1 \pmod{4}$ with $Q < -q \leq 2Q$ and $|M(Q)| > Q^{\vartheta - 1/3}$. Then
\[
  \sum_{q \in M(Q)} \Bigl| B(q; \phi) - \int_{-\infty}^{\infty} \phi(x) \Bigl( 1 - \frac{\sin 2\pi x}{2\pi x} \Bigr) dx \Bigr|^2
  \ll_{\vartheta, \phi} |M(Q)| \cdot \frac{\log \log Q}{\log Q}.
\]
It is worth noting that the proof of their result draws heavily on the correspondence between
the density conjecture (2.41) and the distribution of primes $p$ of the form $4p = x^2 - q y^2$ for
negative fundamental discriminants $q$.
2.3.3 Complex class group characters
In this section, we estimate the first term $E_k'(Q, X)$ in (2.40). Using dyadic decomposition and
the class number bound (2.33) for the discriminants in $M''(Q)$, we get
\[
  E_k'(Q, X) \ll (\log X)^2 \max_{Y \leq X} \max_{Q_0 \leq Q_1 \leq Q} Q_1^{-1/2}
  \sum_{q \in M''(Q_1)} \sum_{\substack{\chi \in \widehat{H}(q) \\ \chi^2 \neq \chi_0^{(q)}}}
  \Bigl| \int_{(c)} \frac{L'}{L}(s, \lambda_\chi)\, Y^s s^{-(k+1)}\, ds \Bigr| \tag{2.42}
\]
for all $c > 1$. Like in Section 2.2, we set
\[
  \widehat{H}_1(q) = \{ \chi \in \widehat{H}(q) \mid \chi^2 \neq \chi_0^{(q)} \}
\]
for all $q \in M''(Q_1)$ and
\[
  \widehat{H}_1([Q_1]'') = \bigcup_{q \in M''(Q_1)} \widehat{H}_1(q).
\]
Moreover, we let $a_\chi(n)$ denote the coefficients of the L-series of the logarithmic derivative of
$L(s, \lambda_\chi)$, i.e.
\[
  \frac{L'}{L}(s, \lambda_\chi) = \sum_{n \geq 1} \frac{a_\chi(n)}{n^s},
\]
and split it according to Bombieri’s modification of Gallagher’s identity: For every $1 \leq z \leq X$,
we set
\[
  F_z := F_z(s, \lambda_\chi) := \sum_{n \leq z} \frac{a_\chi(n)}{n^s}, \qquad
  G_z := G_z(s, \lambda_\chi) := \sum_{n > z} \frac{a_\chi(n)}{n^s}, \qquad
  M_z := M_z(s, \lambda_\chi) := \sum_{n \leq z} \frac{b_\chi(n)}{n^s},
\]
where the coefficients $b_\chi(n)$ are the coefficients of $L(s, \lambda_\chi)^{-1}$. Then
\[
  \frac{L'}{L} = G_z (1 - L M_z) + F_z (1 - L M_z) + L' M_z. \tag{2.43}
\]
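Identity (2.43) is purely algebraic: it only uses $F_z + G_z = L'/L$. The following Python sketch confirms this with random complex numbers standing in for the values of $L$, $L'$, $F_z$ and $M_z$ at a fixed point $s$; the function names are hypothetical and no actual L-function is evaluated.

```python
import random

def gallagher_rhs(L, dL, F, G, M):
    """Right-hand side of (2.43): G(1 - LM) + F(1 - LM) + L'M."""
    return G * (1 - L * M) + F * (1 - L * M) + dL * M

def max_identity_error(trials=1000, seed=0):
    """Largest deviation |rhs - L'/L| over random complex test values."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        L, dL, F, M = (complex(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(4))
        if abs(L) < 0.1:
            continue                      # stay away from division by (almost) zero
        G = dL / L - F                    # the tail is defined by F + G = L'/L
        worst = max(worst, abs(gallagher_rhs(L, dL, F, G, M) - dL / L))
    return worst

if __name__ == "__main__":
    print(max_identity_error())
```

The point of the splitting is therefore not the identity itself but the fact that each piece on the right can be bounded in mean square, as is done in (2.44).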
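Thus, for all $c > 1$, we have
\[
  \int_{(c)} \frac{L'}{L}(s, \lambda_\chi) \frac{Y^s}{s^{k+1}}\, ds
  = \int_{(c)} G_z (1 - L M_z) \frac{Y^s}{s^{k+1}}\, ds
  + \int_{(c)} \bigl( F_z (1 - L M_z) + L' M_z \bigr) \frac{Y^s}{s^{k+1}}\, ds.
\]
We may move the line of integration of the second integral into the critical strip because $F_z$
and $M_z$ are Dirichlet polynomials and $L$ and $L'$ are entire functions for all $\chi \in \widehat{H}_1([Q_1]'')$ (see
[IK04, Theorem 14.17], for example). Repeatedly using the inequality $2|ab| \leq |a|^2 + |b|^2$, we
obtain
\[
\begin{aligned}
  \max_{Y \leq X} \sum_{\chi \in \widehat{H}_1([Q_1]'')} \Bigl| \int_{(c)} \frac{L'}{L}(s, \lambda_\chi)\, Y^s s^{-(k+1)}\, ds \Bigr|
  &\ll X^c \sum_{\chi \in \widehat{H}_1([Q_1]'')} \int_{(c)} \bigl( |G_z|^2 + |1 - L M_z|^2 \bigr) |s|^{-(k+1)}\, |ds| \\
  &\quad + X^{1/2} \sum_{\chi \in \widehat{H}_1([Q_1]'')} \int_{(1/2)} \bigl( 1 + |F_z|^2 + |M_z|^2 + |F_z M_z|^2 \bigr) |s|^{-(k+1)}\, |ds| \\
  &\quad + X^{1/2} \sum_{\chi \in \widehat{H}_1([Q_1]'')} \int_{(1/2)} \bigl( |L|^2 + |L'|^2 \bigr) |s|^{-(k+1)}\, |ds|
\end{aligned}
\tag{2.44}
\]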
for all $c > 1$. The first and second term on the right-hand side will be evaluated by our large
sieve inequality for complex class group characters, in particular by Corollary 2.11. Before we
can do this, we have to determine the coefficients $a_\chi(n)$ and $b_\chi(n)$ of $F_z$, $G_z$ and $M_z$. This
is slightly more complicated than in the classical case, since if $\chi$ is a class group character in
$\widehat{H}_1(q)$, then the product
\[
  \lambda_\chi(m) \lambda_\chi(n) = \sum_{d \mid (m, n)} \chi_q(d)\, \lambda_\chi(m n d^{-2}), \tag{2.45}
\]
where $\chi_q$ is the primitive real Dirichlet character modulo $|q|$, is not as simple as the product
of two values of a Dirichlet character (see [Iwa97, §6.6], for example; recall that the $\lambda_\chi(n)$ are
coefficients of primitive holomorphic cusp forms of weight one, level $q$ and nebentypus $\chi_q$, as
we already mentioned in Section 2.2). This product formula yields the Euler product
\[
  L(s, \lambda_\chi) = \prod_p \bigl( 1 - \lambda_\chi(p) p^{-s} + \chi_q(p) p^{-2s} \bigr)^{-1},
\]
from which one easily deduces (see [KM97, Lemma 2.1]) that
\[
  L(s, \lambda_\chi)^{-1} = \sum_{\ell, m \geq 1} \chi_q(m) \mu(\ell) |\mu(\ell m)| \lambda_\chi(\ell) (\ell m^2)^{-s}. \tag{2.46}
\]
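Both (2.45) and (2.46) can be tested numerically as soon as the coefficients are explicit. The sketch below makes a simplifying assumption that is not the setting of this subsection: instead of a complex class group character it takes the genus character of discriminant $q = -15$, for which the Kronecker factorization (2.70) gives $\lambda(n) = \sum_{d \mid n} \chi_{-3}(d) \chi_5(n/d)$. The multiplicativity relation and the inversion formula have the same shape in that case, so the check illustrates the combinatorics only. All function names are hypothetical.

```python
from math import gcd

def chi_m3(n):
    """Kronecker symbol (-3/n) for n >= 1."""
    return (0, 1, -1)[n % 3]

def chi_5(n):
    """Kronecker symbol (5/n) for n >= 1."""
    return (0, 1, -1, -1, 1)[n % 5]

def chi_q(n):
    """Kronecker symbol (-15/n), the product of the two symbols above."""
    return chi_m3(n) * chi_5(n)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def lam(n):
    """Coefficient of n^(-s) in L(s, chi_{-3}) L(s, chi_5)."""
    return sum(chi_m3(d) * chi_5(n // d) for d in divisors(n))

def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def hecke_relation_holds(m, n):
    """Check lam(m) lam(n) = sum_{d | (m,n)} chi_q(d) lam(mn/d^2), cf. (2.45)."""
    rhs = sum(chi_q(d) * lam(m * n // (d * d)) for d in divisors(gcd(m, n)))
    return lam(m) * lam(n) == rhs

def inverse_coefficient(n):
    """Coefficient of n^(-s) in 1/L as predicted by (2.46)."""
    total, m = 0, 1
    while m * m <= n:
        if n % (m * m) == 0:
            l = n // (m * m)
            total += chi_q(m) * mobius(l) * abs(mobius(l * m)) * lam(l)
        m += 1
    return total

def inversion_holds(N):
    """Check that the Dirichlet convolution of lam and (2.46) is the identity up to N."""
    return all(
        sum(lam(d) * inverse_coefficient(n // d) for d in divisors(n)) == (1 if n == 1 else 0)
        for n in range(1, N + 1)
    )
```

Calling `hecke_relation_holds(m, n)` over a grid of small $m, n$ and `inversion_holds(200)` exercises both formulas on the same coefficient sequence.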
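We thus get the following expressions for the Dirichlet series $M_z$, $F_z$, $G_z$ and $1 - L M_z$:
\[
  M_z(s, \lambda_\chi) = \sum_{\substack{\ell, m \geq 1 \\ \ell m^2 \leq z}} \chi_q(m) \mu(\ell) |\mu(\ell m)| \lambda_\chi(\ell) (\ell m^2)^{-s}, \tag{2.47}
\]
\[
  F_z(s, \lambda_\chi) = - \sum_{\substack{k, \ell, m \geq 1 \\ k \ell m^2 \leq z}} (\log k) \chi_q(m) \mu(\ell) |\mu(\ell m)| \lambda_\chi(\ell) \lambda_\chi(k) (k \ell m^2)^{-s}, \tag{2.48}
\]
\[
  G_z(s, \lambda_\chi) = - \sum_{\substack{k, \ell, m \geq 1 \\ k \ell m^2 > z}} (\log k) \chi_q(m) \mu(\ell) |\mu(\ell m)| \lambda_\chi(\ell) \lambda_\chi(k) (k \ell m^2)^{-s}, \tag{2.49}
\]
\[
  1 - L M_z(s, \lambda_\chi) = - \sum_{\substack{k, \ell, m \geq 1 \\ \ell m^2 \leq z \\ k \ell m^2 > z}} \chi_q(m) \mu(\ell) |\mu(\ell m)| \lambda_\chi(\ell) \lambda_\chi(k) (k \ell m^2)^{-s}. \tag{2.50}
\]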
These series are not yet in the right form for a direct application of Corollary 2.11, but the
following (in)equalities will bring them into the right shape:
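Lemma 2.25. For all positive integers $\ell$ and $m$, let $A(\ell, m)$ be a complex number.
(a) Let $\alpha > 0$. Assume that $\sum_{\ell \geq 1} A(\ell, m) \ell^{-(1+\alpha)+it} \ll m$ for all $t \in \mathbb{R}$. Then
\[
  \Bigl| \sum_{\ell, m \geq 1} A(\ell, m) (\ell m^2)^{-(1+\alpha)+it} \Bigr|^2
  \ll \alpha^{-1} \sum_{m \geq 1} m^{-3-2\alpha} \Bigl| \sum_{\ell \geq 1} A(\ell, m) \ell^{-(1+\alpha)+it} \Bigr|^2. \tag{2.51}
\]
(b) Assume that $\bigl| \sum_{\ell \geq 1} A(\ell, m) \ell^{-1/2+it} \bigr| < \infty$ for all $m \geq 1$ and all $t \in \mathbb{R}$. Moreover, assume
that there exists a real number $M$ such that $A(\ell, m) = 0$ for all $m > M$ and all $\ell \geq 1$.
Then
\[
  \Bigl| \sum_{\ell, m \geq 1} A(\ell, m) (\ell m^2)^{-1/2+it} \Bigr|^2
  \ll (\log M) \sum_{m \leq M} m^{-1} \Bigl| \sum_{\ell \geq 1} A(\ell, m) \ell^{-1/2+it} \Bigr|^2. \tag{2.52}
\]
(c) Let $\chi \in \widehat{H}_1(q)$ and $j_1, j_2, j_3 \geq 1$. Then
\[
  \sum_{\substack{\ell, m \geq 1 \\ \ell m > j_1}} A(\ell, m) \lambda_\chi(\ell) \lambda_\chi(m) (\ell m)^{-s}
  = \sum_{\substack{h, d \geq 1 \\ h d^2 > j_1}} \chi_q(d) \sum_{\substack{v, w \geq 1 \\ v w = h}} A(vd, wd) \lambda_\chi(h) (h d^2)^{-s} \tag{2.53}
\]
and
\[
  \sum_{\ell \leq j_2} \sum_{m \leq j_3} A(\ell, m) \lambda_\chi(\ell) \lambda_\chi(m) (\ell m)^{-s}
  = \sum_{h, d \geq 1} \chi_q(d) \sum_{v \leq j_2/d} \sum_{\substack{w \leq j_3/d \\ v w = h}} A(vd, wd) \lambda_\chi(h) (h d^2)^{-s} \tag{2.54}
\]
for all $s \in \mathbb{C}$ for which the series converge.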
Proof. By the Cauchy–Schwarz inequality, we have
\[
  \Bigl| \sum_{\ell, m \geq 1} A(\ell, m) (\ell m^2)^{-s} \Bigr|^2
  \leq \sum_{m \geq 1} m^{2(r - \mathrm{Re}(s))} \sum_{m \geq 1} m^{-2(r + \mathrm{Re}(s))} \Bigl| \sum_{\ell \geq 1} A(\ell, m) \ell^{-s} \Bigr|^2 \tag{2.55}
\]
for all real numbers $r$ and all complex numbers $s$ for which the sums on the right side converge.
The first bound follows for $r = \frac{1}{2}$ and $s = (1 + \alpha) - it$.
As for the second bound, the sums on the right side of (2.55) are then only over $m \leq M$;
the bound follows for $r = 0$ and $s = \frac{1}{2} - it$.
The equalities in (c) follow from (2.45).
Remark. These (in)equalities have been used in [KM97, §7] to prove a zero-density estimate
for L-functions associated to certain cusp forms. The first proofs of the Bombieri–Vinogradov
theorem relied heavily on zero-density estimates for Dirichlet L-functions; Gallagher’s simplification of these proofs then removed any direct appeal to the zeros but still kept the core of the
argument. Thus, it is not surprising that Lemma 2.25 plays a role both here and in [KM97].
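Now we set $\alpha = (\log X)^{-1}$ and $c = 1 + \alpha$, then apply (2.51) and (2.53) to (2.49) and obtain
\[
\begin{aligned}
  |G_z(c + it, \lambda_\chi)|^2
  &\ll (\log X) \sum_{m \geq 1} m^{-3-2\alpha}
     \Bigl| \sum_{\substack{k, \ell \geq 1 \\ k \ell > z/m^2}} (\log k) \mu(\ell) |\mu(\ell m)| \lambda_\chi(\ell) \lambda_\chi(k) (k \ell)^{-(c+it)} \Bigr|^2 \\
  &= (\log X) \sum_{m \geq 1} m^{-3-2\alpha}
     \Bigl| \sum_{\substack{h, d \geq 1 \\ h d^2 > z/m^2}} \chi_q(d) \sum_{\substack{v, w \geq 1 \\ v w = h}} (\log vd) \mu(wd) |\mu(wdm)| \lambda_\chi(h) (h d^2)^{-(c+it)} \Bigr|^2.
\end{aligned}
\tag{2.56}
\]
Set
\[
  a_1(h, d, m) = \sum_{\substack{v, w \geq 1 \\ v w = h}} (\log vd) \mu(wd) |\mu(wdm)| \tag{2.57}
\]
and apply once again (2.51) to the right side of (2.56). This yields
\[
  |G_z(c + it, \lambda_\chi)|^2 \ll (\log X)^2 \sum_{m \geq 1} m^{-3-2\alpha} \sum_{d \geq 1} d^{-3-2\alpha}
  \Bigl| \sum_{h > z/(m^2 d^2)} a_1(h, d, m) \lambda_\chi(h) h^{-(c+it)} \Bigr|^2,
\]
which now has the right form to apply Corollary 2.11. We get
\[
  \sum_{\chi \in \widehat{H}_1([Q_1]'')} \int_{(c)} |G_z(s, \lambda_\chi)|^2 |s|^{-(k+1)}\, |ds|
  \ll_\varepsilon Q_1^{\varepsilon} (\log X)^2 \sum_{m, d \geq 1} (md)^{-3-2\alpha}
  \sum_{h > z/(m^2 d^2)} |a_1(h, d, m)|^2 \bigl( h^{-1-2\alpha} + h^{-3/2-2\alpha} Q_1^{5/2} \bigr) (\log h)^3.
\]
Since $z^{\alpha} \leq X^{\alpha} \ll 1$ and
\[
  |a_1(h, d, m)|^2 \leq \tau(h)^2 (\log hd)^2
\]
for all $h$, $d$ and $m$, the contribution coming from $|G_z|^2$ in (2.44) is bounded by
\[
  O_\varepsilon\bigl( X (\log X)^{K_1} Q_1^{\varepsilon} (1 + Q_1^{5/2} z^{-1/2}) \bigr) \tag{2.58}
\]
for some $K_1 > 0$; in fact, we may choose $K_1 = 11$.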
A comparison of (2.49) and (2.50) shows that the analysis of the contribution coming from
|1 − LMz |2 in (2.44) can be performed in almost exactly the same way and the same bound is
obtained. Thus we record that the whole first term on the right side of (2.44) can be bounded
by (2.58).
Moving on to the second line of (2.44), each summand in the integrand is again analysed
separately. The contribution coming from the integrand 1 follows directly from (2.10):
\[
  X^{1/2} \sum_{\chi \in \widehat{H}_1([Q_1]'')} \int_{(1/2)} 1 \cdot |s|^{-(k+1)}\, |ds|
  \ll X^{1/2}\, |\widehat{H}_1([Q_1]'')| \ll X^{1/2} Q_1^{3/2} (\log Q_1). \tag{2.59}
\]
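Next, $F_z$ and $M_z$ are bounded in the same way as $G_z$ but with appeal to (2.52) (with $M = z$)
instead of (2.51) and (2.54) instead of (2.53). In fact, with $a_1(h, d, m)$ given by (2.57), we find
\[
\begin{aligned}
  X^{1/2} \sum_{\chi \in \widehat{H}_1([Q_1]'')} \int_{(1/2)} |F_z(s, \lambda_\chi)|^2 |s|^{-(k+1)}\, |ds|
  &\ll_\varepsilon X^{1/2} (\log z)^2 Q_1^{\varepsilon} \sum_{m, d \leq z} (md)^{-1}
     \sum_{h \leq z/(m^2 d^2)} |a_1(h, d, m)|^2 \bigl( 1 + h^{-1/2} Q_1^{5/2} \bigr) (\log h)^3 \\
  &\ll X^{1/2} (\log X)^{K_2} Q_1^{\varepsilon} \bigl( z + Q_1^{5/2} z^{1/2} \bigr)
\end{aligned}
\tag{2.60}
\]
for some $K_2 > 0$; we may choose $K_2 = 10$. Similarly,
\[
\begin{aligned}
  X^{1/2} \sum_{\chi \in \widehat{H}_1([Q_1]'')} \int_{(1/2)} |M_z(s, \lambda_\chi)|^2 |s|^{-(k+1)}\, |ds|
  &\ll_\varepsilon X^{1/2} Q_1^{\varepsilon} (\log z) \sum_{m \leq z} m^{-1}
     \sum_{\ell \leq z/m^2} |\mu(\ell) \mu(\ell m)|^2 \bigl( 1 + \ell^{-1/2} Q_1^{5/2} \bigr) (\log \ell)^3 \\
  &\ll X^{1/2} (\log X)^{K_2} Q_1^{\varepsilon} \bigl( z + Q_1^{5/2} z^{1/2} \bigr).
\end{aligned}
\tag{2.61}
\]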
The integrand $|F_z M_z|^2$ requires a little bit more work, but the approach is familiar by now:
By (2.47), (2.48), (2.52) and (2.54), we have
\[
  \bigl| F_z\bigl(\tfrac{1}{2} + it, \lambda_\chi\bigr) M_z\bigl(\tfrac{1}{2} + it, \lambda_\chi\bigr) \bigr|^2
  \ll (\log z)^3 \sum_{m, w, d \leq z} (mwd)^{-1}
  \Bigl| \sum_{v \leq z/w^2} \sum_{b \leq z/(md)^2} a_2(b, d, m, v, w) \lambda_\chi(b) \lambda_\chi(v) (bv)^{-(1/2+it)} \Bigr|^2,
\]
where
\[
  a_2(b, d, m, v, w) = \mu(v) |\mu(vw)|\, a_1(b, d, m).
\]
By (2.54) and (2.52), we then get
\[
  \bigl| F_z\bigl(\tfrac{1}{2} + it, \lambda_\chi\bigr) M_z\bigl(\tfrac{1}{2} + it, \lambda_\chi\bigr) \bigr|^2
  \ll (\log z)^4 \sum_{m, w, d, r \leq z} (mwdr)^{-1}
  \Bigl| \sum_{h \geq 1} a_3(h, r, d, m, w) \lambda_\chi(h) h^{-(1/2+it)} \Bigr|^2,
\]
where
\[
  a_3(h, r, d, m, w) = \sum_{v' \leq z/(w^2 r)} \sum_{\substack{b' \leq z/((md)^2 r) \\ v' b' = h}} a_2(b' r, d, m, v' r, w),
\]
whose absolute value is
\[
  |a_3(h, r, d, m, w)| \leq \sum_{v' \leq z/(w^2 r)} \sum_{\substack{b' \leq z/((md)^2 r) \\ v' b' = h}} \tau(b' r) (\log b' r d)
  \ll (\log z)\, \tau(r)\, \tau_3(h),
\]
where $\tau_3(h)$ is the ternary divisor function (i.e., the number of ordered 3-tuples $(b_1, b_2, b_3)$ of
positive integers such that $h = b_1 b_2 b_3$). By Corollary 2.11 and the bound
\[
  \sum_{h \leq z^2/(mwdr)^2} \tau_3(h)^2 \ll \frac{z^2 (\log z)^8}{(mwdr)^2},
\]
which follows by the method of [Kow04, p. 37], for example, we obtain
\[
  X^{1/2} \sum_{\chi \in \widehat{H}_1([Q_1]'')} \int_{(1/2)} |F_z(s, \lambda_\chi) M_z(s, \lambda_\chi)|^2 |s|^{-(k+1)}\, |ds|
  \ll_\varepsilon X^{1/2} (\log X)^{K_3} Q_1^{\varepsilon} \bigl( z^2 + Q_1^{5/2} z \bigr) \tag{2.62}
\]
for some $K_3 > 0$; we may choose $K_3 = 17$.
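The ternary divisor function and its second moment are easy to tabulate. The following Python sketch (with hypothetical function names) computes $\tau_3$ by a double divisor sieve and evaluates the ratio $\sum_{h \leq N} \tau_3(h)^2 / (N (\log N)^8)$; it illustrates the order of magnitude used above and is not a substitute for the proof in [Kow04].

```python
from math import log

def tau3_table(N):
    """t3[n] = number of ordered triples (b1, b2, b3) with b1 * b2 * b3 = n."""
    tau = [0] * (N + 1)
    for d in range(1, N + 1):
        for m in range(d, N + 1, d):
            tau[m] += 1                     # ordinary divisor function
    t3 = [0] * (N + 1)
    for d in range(1, N + 1):
        for m in range(d, N + 1, d):
            t3[m] += tau[m // d]            # tau_3(m) = sum over d | m of tau(m / d)
    return t3

def second_moment_ratio(N):
    """sum_{h <= N} tau_3(h)^2 divided by N * (log N)^8."""
    t3 = tau3_table(N)
    return sum(t * t for t in t3[1:]) / (N * log(N) ** 8)

if __name__ == "__main__":
    for N in (100, 1000, 5000):
        print(N, second_moment_ratio(N))
```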
We gather the bounds (2.59), (2.60), (2.61), (2.62) and record that the contribution to the
right side of (2.44) coming from the second line is
\[
  O_\varepsilon\bigl( X^{1/2} (\log X)^{K_3} Q_1^{\varepsilon} (z^2 + Q_1^{5/2} z) \bigr). \tag{2.63}
\]
It remains to bound the third term on the right side of (2.44). We could proceed as in
[Bom87], using the bound $\sum_{n \leq N} \lambda_\chi(n) \ll_\varepsilon (|q|^2 N)^{1/2+\varepsilon}$ that holds for Fourier coefficients of
weight-one cusp forms and therefore for our coefficients $\lambda_\chi$ as they arise from complex class
group characters here (see Proposition 5 in [HM06], for example). However, in our case it is
sufficient and easier to use the convexity bound for the functions $L(s, \lambda_\chi)$: Each of them satisfies
a functional equation of the form
\[
  \Phi(s, \lambda_\chi) = \Phi(1 - s, \lambda_\chi), \tag{2.64}
\]
where
\[
  \Phi(s, \lambda_\chi) = \Bigl( \frac{\sqrt{|q|}}{2\pi} \Bigr)^{s} \Gamma(s) L(s, \lambda_\chi);
\]
see [IK04, §22.3], for example. Therefore, the convexity principle of Phragmén–Lindelöf (see
[Gol06, Theorem 8.2.3], for example) yields
\[
  L(\tfrac{1}{2} + it, \lambda_\chi) \ll_\varepsilon \bigl( |q|^{1/2} (1 + |t|) \bigr)^{1/2+\varepsilon}
\]
for all $t \in \mathbb{R}$. Combining the convexity principle for $L(s, \lambda_\chi)$ and Cauchy’s inequality for the
derivative of analytic functions (consider the disc around $\frac{1}{2} + it$ with radius $(\log Q_1)^{-1}$; see
[Rud87, Theorem 10.26], for example), we also get
\[
  L'(s, \lambda_\chi) \ll_\varepsilon |q|^{1/4+\varepsilon} (1 + |t|)^{1/2+\varepsilon+(\log Q_1)^{-1}} (\log Q_1).
\]
These bounds and (2.10) yield
\[
  \sum_{\chi \in \widehat{H}_1([Q_1]'')} \int_{(1/2)} \bigl( |L(s, \lambda_\chi)|^2 + |L'(s, \lambda_\chi)|^2 \bigr) |s|^{-(k+1)}\, |ds|
  \ll_\varepsilon (\log Q_1)^3 Q_1^{2+\varepsilon} \tag{2.65}
\]
for $k \geq 2$.
Remark. Duke, Friedlander and Iwaniec [DFI02, Theorem 2.6] proved the first subconvexity
bound for the L-functions associated to complex class group characters for all fundamental
discriminants (they had previously proved such a bound for special types of discriminants).
Subsequently, a simpler proof – and a slightly better bound – was found by Blomer, Harcos and
Michel [BHM07, Corollary 1]. As is clear from the theorem numbering of these results, these
are only special cases of subconvexity bounds for much more general L-functions (and any such
list of results would be incomplete without the general GL2 -result by Michel and Venkatesh
[MV10]). The convexity bound is more than enough for our needs and any invocation of these
deep results would be pretentious here.
Let $K = A + 2 + K_4$ for some $K_4 \geq \max(K_1, K_2, K_3, 3)$; thus, $K = A + 19$ is admissible, for
example. We put together the upper bounds (2.58), (2.63) and (2.65) that we have found for
the three summands in (2.44), insert them into (2.42) and get
\[
  E_k'(Q, X) \ll_\varepsilon (\log X)^{K-A} \max_{Q_0 \leq Q_1 \leq Q} Q_1^{-1/2+\varepsilon} X^{1/2}
  \times \Bigl( X^{1/2} \bigl( 1 + Q_1^{5/2} z^{-1/2} \bigr) + \bigl( z^2 + Q_1^{5/2} z \bigr) + Q_1^2 \Bigr). \tag{2.66}
\]
Now we set
\[
  D := \max(D_1, 4K) \tag{2.67}
\]
in Remark 2.23. We can assume without loss of generality that $\varepsilon \leq \frac{1}{4}$. If $Q \geq Q_0 = (\log X)^D$,
then $Q_1^{1/2-\varepsilon}$ is therefore at least $(\log X)^K$ and if we choose $z = Q_1^{4-\nu+2\varepsilon} (\log X)^{2K}$, we get
\[
  E_k'(Q, X) \ll_\varepsilon (\log X)^{-A} Q^{\nu/2} \bigl( X + (\log X)^{5K} X^{1/2} Q^{(15-5\nu)/2+5\varepsilon} \bigr).
\]
This gives
\[
  E_k'(Q, X) \ll_\varepsilon Q^{\nu/2} X (\log X)^{-A}
\]
if $Q^{15-5\nu+10\varepsilon} \leq X (\log X)^{-B}$ where $B = 10K$; that is, we may choose
\[
  B = 10A + 190. \tag{2.68}
\]
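The exponent bookkeeping behind the choice of $z$ can be replayed with exact rational arithmetic. The Python sketch below takes the five terms of (2.66) as read above, inserts $z = Q_1^{4-\nu+2\varepsilon} (\log X)^{2K}$, and returns for each term the resulting exponents of $X$, $Q_1$ and $\log X$. The function name and the sample parameters are illustrative assumptions; the code only tracks exponents and says nothing about implied constants.

```python
from fractions import Fraction as Fr

def exponents_after_choice_of_z(nu, eps, A, K):
    """Exponents (of X, Q1, log X) of each term of (2.66) after inserting
    z = Q1^(4 - nu + 2 eps) * (log X)^(2K)."""
    z_q, z_log = 4 - nu + 2 * eps, 2 * K        # exponents of Q1 and log X in z
    pre_q, pre_log = -Fr(1, 2) + eps, K - A     # prefactor Q1^(-1/2+eps) (log X)^(K-A)
    # raw terms of (2.66): (exponent of X, exponent of Q1, exponent of z)
    raw_terms = {
        "X": (Fr(1), Fr(0), Fr(0)),
        "X Q1^(5/2) z^(-1/2)": (Fr(1), Fr(5, 2), Fr(-1, 2)),
        "X^(1/2) z^2": (Fr(1, 2), Fr(0), Fr(2)),
        "X^(1/2) Q1^(5/2) z": (Fr(1, 2), Fr(5, 2), Fr(1)),
        "X^(1/2) Q1^2": (Fr(1, 2), Fr(2), Fr(0)),
    }
    return {
        name: (x, pre_q + q + zp * z_q, pre_log + zp * z_log)
        for name, (x, q, zp) in raw_terms.items()
    }

if __name__ == "__main__":
    nu, eps, A = Fr(1, 2), Fr(1, 10), 2
    for name, exps in exponents_after_choice_of_z(nu, eps, A, A + 19).items():
        print(name, exps)
```

In this bookkeeping the term $X Q_1^{5/2} z^{-1/2}$ becomes exactly $X Q_1^{\nu/2} (\log X)^{-A}$, the term $X^{1/2} z^2$ carries the largest power $Q_1^{\nu/2 + (15-5\nu)/2 + 5\varepsilon}$ together with $(\log X)^{5K-A}$, and the plain term $X$ keeps the factor $Q_1^{-1/2+\varepsilon} (\log X)^{K-A}$, which is absorbed by $Q_1^{1/2-\varepsilon} \geq (\log X)^K$.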
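Remark 2.26. If we assume the Lindelöf Hypothesis, we may use the conditional large sieve
inequality (2.18) instead of (2.17). This leads to the bound
\[
  E_k'(Q, X) \ll_\varepsilon (\log X)^{K-A} \max_{Q_0 \leq Q_1 \leq Q} Q_1^{-1/2+\varepsilon} X^{1/2}
  \times \Bigl( X^{1/2} \bigl( 1 + Q_1^{3/2} z^{-1/2} \bigr) + \bigl( z^2 + Q_1^{3/2} z \bigr) + Q_1^2 \Bigr)
\]
instead of (2.66), and this yields
\[
  E_k'(Q, X) \ll_\varepsilon Q^{\nu/2} X (\log X)^{-A}
\]
if either $\nu > \frac{1}{2} + 2\varepsilon$ and $Q^{6-3\nu+6\varepsilon} \leq X (\log X)^{-6K}$, or $\nu \leq \frac{1}{2} + 2\varepsilon$ and $Q^{7-5\nu+10\varepsilon} \leq X (\log X)^{-10K}$.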
2.3.4 Real class group characters
Before approaching the second sum $E_k''(Q, X)$ in (2.40), we note that each of the functions
$\widetilde{\psi}(X; q, \chi)$ for real class group characters $\chi$ can be written as the sum of two Chebyshev functions
for Dirichlet characters: If $q \in F$ and $\chi \in \widehat{H}(q)$ is a real class group character, it is given by
\[
  \chi(\mathfrak{p}) = \chi_{d_1, d_2}(\mathfrak{p}) :=
  \begin{cases}
    \chi_{d_1}(N(\mathfrak{p})) & \text{if } \mathfrak{p} \nmid d_1, \\
    \chi_{d_2}(N(\mathfrak{p})) & \text{if } \mathfrak{p} \nmid d_2,
  \end{cases}
  \tag{2.69}
\]
for all prime ideals $\mathfrak{p} \in Z(q)$ and then defined, by extension, for all non-zero fractional ideals.
Here $d_1$ and $d_2$ are two (positive or negative) fundamental discriminants with $d_1 d_2 = q$ and,
as before, $\chi_d$ denotes the Kronecker symbol $\bigl( \frac{d}{\cdot} \bigr)$, which is the unique primitive real Dirichlet
character modulo $|d|$ if $d \not\equiv 0 \pmod{8}$ (which we assume in this chapter). On the other hand,
every such factorization of $q$ gives a real class group character of $H(q)$. Note that the trivial
class group character $\chi_0^{(q)}$ corresponds to the trivial factorization $q = 1 \cdot q$ (see [IK04, p. 510]).
By the Kronecker Factorization Formula [Iwa97, Theorem 12.7], the L-function associated
to the real class group character $\chi_{d_1, d_2}$ factors as
\[
  L(s, \lambda_{\chi_{d_1, d_2}}) = L(s, \chi_{d_1}) L(s, \chi_{d_2}) \tag{2.70}
\]
into Dirichlet L-functions for all fundamental discriminants $d_1$ and $d_2$. Thus
\[
  \frac{L'(s, \lambda_{\chi_{d_1, d_2}})}{L(s, \lambda_{\chi_{d_1, d_2}})}
  = \frac{L'(s, \chi_{d_1})}{L(s, \chi_{d_1})} + \frac{L'(s, \chi_{d_2})}{L(s, \chi_{d_2})}. \tag{2.71}
\]
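The factorization (2.70) can be observed on the smallest interesting example, $q = -15 = (-3) \cdot 5$, where the class group has order two. The Python sketch below assumes the standard dictionary between ideal classes and form classes, namely that the number of integral ideals of norm $n$ in a class equals the number of representations of $n$ by the corresponding reduced form divided by 2; under this assumption the coefficient of $n^{-s}$ on the left of (2.70) is $(r_{f_0}(n) - r_{f_1}(n))/2$ with $f_0 = x^2 + xy + 4y^2$ and $f_1 = 2x^2 + xy + 2y^2$. Function names are hypothetical.

```python
from math import isqrt

def chi_m3(n):
    """Kronecker symbol (-3/n) for n >= 1."""
    return (0, 1, -1)[n % 3]

def chi_5(n):
    """Kronecker symbol (5/n) for n >= 1."""
    return (0, 1, -1, -1, 1)[n % 5]

def representations(a, b, c, n):
    """Number of integer pairs (x, y) with a x^2 + b x y + c y^2 = n
    for a positive definite form (a, b, c)."""
    disc = 4 * a * c - b * b                 # minus the discriminant, positive here
    xmax = isqrt(4 * c * n // disc) + 1      # from 4cn = (bx + 2cy)^2 + disc * x^2
    ymax = isqrt(4 * a * n // disc) + 1      # from 4an = (2ax + by)^2 + disc * y^2
    return sum(
        1
        for x in range(-xmax, xmax + 1)
        for y in range(-ymax, ymax + 1)
        if a * x * x + b * x * y + c * y * y == n
    )

def genus_character_coefficient(n):
    """Ideals of norm n weighted by the non-trivial genus character for q = -15."""
    return (representations(1, 1, 4, n) - representations(2, 1, 2, n)) // 2

def dirichlet_product_coefficient(n):
    """Coefficient of n^(-s) in L(s, chi_{-3}) L(s, chi_5)."""
    return sum(chi_m3(d) * chi_5(n // d) for d in range(1, n + 1) if n % d == 0)

if __name__ == "__main__":
    for n in range(1, 21):
        print(n, genus_character_coefficient(n), dirichlet_product_coefficient(n))
```

For instance, $n = 2$ is represented four times by $f_1$ and not at all by $f_0$, giving the coefficient $-2 = \chi_{-3}(2) + \chi_5(2)$.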
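Let $F(Q)$ denote the set of all (positive or negative) fundamental discriminants $d \neq 1$ with
$|d| \leq Q$ and set
\[
  \psi_k(X; \chi_d) := \frac{1}{k!} \sum_{n \leq X} \chi_d(n) \Lambda(n) \Bigl( \log \frac{X}{n} \Bigr)^{k} \tag{2.72}
\]
for every $d \in F(Q)$. Then we have
\[
  \psi_k(X; \chi_d) = - \frac{1}{2\pi i} \int_{(c)} \frac{L'}{L}(s, \chi_d)\, X^s s^{-(k+1)}\, ds \tag{2.73}
\]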
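for each $c > 1$, by the method of [MV07, §5.1]. Therefore, (2.71) and (2.73) imply
\[
\begin{aligned}
  E_k''(Q, X)
  &\leq \sum_{d_1 \in F(Q)} \sum_{\substack{d_2 \in F(Q) \\ d_1 d_2 \in M''(Q)}} \frac{1}{h(d_1 d_2)} \bigl( |\psi_k(X; \chi_{d_1})| + |\psi_k(X; \chi_{d_2})| \bigr) \\
  &= 2 \sum_{d_1 \in F(Q)} |\psi_k(X; \chi_{d_1})| \sum_{\substack{d_2 \in F(Q) \\ d_1 d_2 \in M''(Q)}} \frac{1}{h(d_1 d_2)}.
\end{aligned}
\]
The class number bound (2.33) for the discriminants in $M''(Q)$ yields
\[
  E_k''(Q, X) \ll (\log Q) \sum_{d_1 \in F(Q)} \frac{1}{|d_1|^{1/2}} |\psi_k(X; \chi_{d_1})|
  \sum_{\substack{d_2 \in F(Q) \\ d_1 d_2 \in M''(Q)}} \frac{1}{|d_2|^{1/2}}.
\]
By the assumptions – regarding the divisor frequency $\nu$, which we defined in Definition 2.12 –
for $M(Q)$ in Theorem 2.13 and for $\Pi(Q)$ in Theorem 2.14, the sum over $d_2$ has at most $Q^{\nu}$
terms. Hence
\[
  E_k''(Q, X) \ll (\log Q) \sum_{d_1 \in F(Q)} \frac{1}{|d_1|^{1/2}} |\psi_k(X; \chi_{d_1})|
  \sum_{d_2 \leq \min(Q^{\nu},\, Q/|d_1|)} \frac{1}{d_2^{1/2}},
\]
which implies, by dyadic decomposition,
\[
\begin{aligned}
  E_k''(Q, X)
  &\ll (\log Q)^2 Q^{\nu/2} \max_{Q_1 \leq Q^{1-\nu}} Q_1^{-1/2} \sum_{d_1 \in F(Q_1)} |\psi_k(X; \chi_{d_1})| \\
  &\quad + (\log Q)^2 Q^{1/2} \max_{Q^{1-\nu} \leq Q_1 \leq Q} Q_1^{-1} \sum_{d_1 \in F(Q_1)} |\psi_k(X; \chi_{d_1})| \\
  &= E_{1;k}''(Q, X) + E_{2;k}''(Q, X), \text{ say.}
\end{aligned}
\]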
Note that we cannot profit here from the fact that M 00 (Q) does not contain any small discriminants, which were already handled by means of Goldstein’s generalization of the Siegel–Walfisz
theorem. Instead, we may use the original Siegel–Walfisz theorem to handle the small discriminant divisors d1 here.
In fact, we have now basically reduced the problem to the analogous problem for Dirichlet
characters, i.e. we are in a similar position as in the original Bombieri–Vinogradov theorem, the
only differences being:
(a) The first term $E_{1;k}''(Q, X)$ above has the factor $Q_1^{-1/2}$ in front of the sum (coming from the
class number estimate) instead of $Q_1^{-1}$ (coming from the Euler totient function estimate)
in the classical case. This will lead to a smaller admissible $Q$ for $\nu < 1$.
(b) Our sums are only over real primitive characters modulo $|d_1|$ with $|d_1| \leq Q$; by positivity,
we can, of course, include the non-real primitive Dirichlet characters as well.
We proceed like in Section 2.3.3, but using the large sieve inequality for Dirichlet characters.
We skip the explicit calculations as they are the same as in [Bom87] and obtain (compare the
inequality at the bottom of page 62 and the top of page 63 in [Bom87])
\[
\begin{aligned}
  \sum_{d_1 \in F(Q_1)} |\psi_k(X; \chi_{d_1})|
  &\ll X (\log X)^4 + X (\log X)^4 Q_1^2 z^{-1} + X^{1/2} (\log X)^6 z^2 \\
  &\quad + X^{1/2} (\log X)^6 Q_1^2 + X^{1/2} (\log X)^2 Q_1^4 z^{-2} =: G(X, Q_1, z).
\end{aligned}
\]
Here the variable $z$ denotes the ordinate at which we truncate the Dirichlet series of $L'/L$ in the
corresponding Gallagher identity (compare (2.43)) and it will be chosen in a moment.
We obtain
\[
  E_{1;k}''(Q, X) \ll Q^{\nu/2} (\log X)^2 \max_{Q_1 \leq Q^{1-\nu}} Q_1^{-1/2}\, G(X, Q_1, z)
\]
and we want to bound the right-hand side with $Q^{\nu/2} X (\log X)^{-A}$. This can be achieved when
we set
\[
  z = Q_1^{3/2} (\log X)^{6+A}
\]
if the maximum above is attained for
\[
  (\log X)^{12+2A} \leq Q_1 \leq X^{1/5} (\log X)^{-8-6A/5}.
\]
If the maximum is attained for a smaller $Q_1$, we use the relation
\[
  \psi_k(X; \chi) = \int_1^X \psi_{k-1}(t; \chi) \frac{dt}{t} \tag{2.74}
\]
and the Siegel–Walfisz theorem (Theorem 2.2), in the form (see [MV07, Corollary 11.18], for example)
\[
  \psi_0(X; \chi) \ll_A X e^{-c \sqrt{\log X}}, \tag{2.75}
\]
which holds with some absolute positive constant $c$ for all $q \leq (\log X)^{12+2A}$ and all non-principal
Dirichlet characters $\chi$ modulo $q$, to get the desired bound. Altogether, we thus have
\[
  E_{1;k}''(Q, X) \ll_A Q^{\nu/2} X (\log X)^{-A} \tag{2.76}
\]
if $Q^{5-5\nu} \leq X (\log X)^{-B}$ for some $B = B(A) > 0$.
Similarly,
\[
  E_{2;k}''(Q, X) \ll Q^{1/2} (\log X)^2 \max_{Q^{1-\nu} \leq Q_1 \leq Q} Q_1^{-1}\, G(X, Q_1, z)
\]
is bounded by $Q^{\nu/2} X (\log X)^{-A}$ if we set $z = Q_1 Q^{1/2-\nu/2} (\log X)^{6+A}$ and if the maximum is
attained for
\[
  (\log X)^{12+2A} \leq Q_1 \quad \text{and} \quad Q \leq X^{1/(5-3\nu)} (\log X)^{-(40+6A)/(5-3\nu)}.
\]
Together with the Siegel–Walfisz theorem this leads to the bound
\[
  E_{2;k}''(Q, X) \ll_A Q^{\nu/2} X (\log X)^{-A} \tag{2.77}
\]
if $Q^{5-3\nu} \leq X (\log X)^{-B}$ for some $B = B(A) > 0$. Since this range is shorter than the range for
$E_{1;k}''(Q, X)$ in (2.76), we have
\[
  E_k''(Q, X) \ll_A Q^{\nu/2} X (\log X)^{-A}
\]
if $Q^{5-3\nu} \leq X (\log X)^{-B}$ for some $B = B(A) > 0$. Note that we may choose $B = 40 + 6A$, which
is smaller than the B-value (2.68) that we have found at the end of Section 2.3.3.
Remark 2.27. It is not possible to achieve a larger range for the above bound on Ek00 (Q, X) on
the assumption of the Lindelöf Hypothesis for Dirichlet characters because the corresponding
large sieve inequality, Lemma 2.6, is already optimal. However, the conditional range for the
bound on Ek0 (Q, X) (see Remark 2.26) is never larger than the unconditional range for the
bound on Ek00 (Q, X), which is why the former range yields the ranges given in Theorem 2.17.
Remark 2.28. For $\nu < 1$, we could also employ Heath-Brown’s large sieve inequality for real
Dirichlet characters [HB95]: Let $(a_n)$ be a sequence of complex numbers and let $\varepsilon > 0$; then
\[
  \sum_{\chi} \Bigl| \sum_{n \leq X} a_n \chi(n) \Bigr|^2 \ll_\varepsilon (QX)^{\varepsilon} (Q + X) \sum_{n_1 n_2 = \square} |a_{n_1} a_{n_2}|, \tag{2.78}
\]
where the outer sum on the left side is over real primitive Dirichlet characters modulo $q \leq Q$
and the sum on the right side is over pairs of integers $n_1, n_2 \leq X$ for which $n_1 n_2$ is a square.
For $\nu < 1$, this inequality yields a larger range for the discriminants in this section, but it
requires a more careful analysis due to the distinct form of the sum on the right side; since we
are anyway limited by the much shorter range coming from $E_k'$, this gives no overall gain and
therefore we will not delve into this. Note that (2.78) does not seem to be applicable for $\nu = 1$:
The term $X^{1+\varepsilon}$ in (2.78) does not permit us to beat the bound (2.24) since our method can
only compensate powers of $(\log X)$ when $\nu = 1$, but not a genuine $X^{\varepsilon}$.
2.3.5 Conclusion of the proofs
The proofs of Theorem 2.13 and Theorem 2.14 are now complete, but we recall the main steps
as well as the bounds which we have found in the last three subsections and thus tie up the loose
ends:
We started by noting that, by transition to ideal classes, we have to bound $\sum_{q \in M(Q)} E_k(X; q)$,
where $E_k$ is defined by (2.27). In Proposition 2.20 we proved that the contribution of exceptional
discriminants is negligible or at most of the desired size
\[
  O\bigl( Q^{\nu/2} X (\log X)^{-A} \bigr) \ \text{ if } \nu > 0 \quad \text{and} \quad O\bigl( X (\log X)^{k+3} \bigr) \ \text{ if } \nu = 0.
\]
Then we showed, by means of Lemma 2.22, that the contribution of small discriminants $q$ of
size $|q| \leq (\log X)^D$ is also acceptable. The exponent $D$ was bounded below in Remark 2.23 and
then defined in (2.67). Since it depends on the parameters $A$, $k$ and $\nu$, the implied constants
in the bounds of Theorem 2.13 and Theorem 2.14 also depend on all these parameters; the
dependence is ineffective because of the ineffectivity of Lemma 2.22.
Next, we demonstrated that the functions $\psi_k(Y; q, \chi)$ do not differ much from the k-th Riesz
typical means $\widetilde{\psi}_k(Y; q, \chi)$ corresponding to the logarithmic derivatives of the L-functions which
are associated to the ideal class group characters $\chi$ (see (2.38)) as long as we can guarantee that
an additional term of size $Q X^{1/2} (\log X)^{k+3}$ is acceptable – which we can do, because we see in
(2.79) below that $Q$ must be much smaller than $X$. The contribution of large discriminants $q$ of
size $|q| > (\log X)^D$ was then estimated for complex and real class group characters separately:
In Section 2.3.3 we showed that the contribution coming from complex class group characters
is of the desired size if
\[
  Q^{15-5\nu+\varepsilon} \leq X (\log X)^{-B} \tag{2.79}
\]
for all $\varepsilon > 0$ and all $\nu \geq 0$; moreover, we may choose
\[
  B = 10A + 190. \tag{2.80}
\]
The contribution depends (effectively) on the parameter $\varepsilon$ and therefore the bounds of
Theorem 2.13 and Theorem 2.14 also depend on $\varepsilon$. (In Remark 2.26 we also found the admissible range which holds if we assume the Lindelöf Hypothesis.)
Finally, in Section 2.3.4, we showed that the contribution coming from real class group
characters is of the desired size if
\[
  Q^{5-3\nu} \leq X (\log X)^{-B}
\]
and we may choose $B = 6A + 40$. Since this range is larger than (2.79) and this value of
$B$ is smaller than (2.80), the final admissible range and the final admissible value of $B$ for
Theorem 2.13 and Theorem 2.14 are given by (2.79) and (2.80), respectively. This finishes both
proofs.
2.3.6 Proofs of the Bombieri–Vinogradov type results
We close this section with the proofs of Theorem 2.15 and Theorem 2.16:
Proof of Theorem 2.15. Fix $A > 0$ and $\varepsilon > 0$. By (1.6), we have
\[
  w(C, p^{\ell}) \leq \ell + 1
\]
for all form classes $C$, all primes $p$ and all positive integers $\ell$. Moreover, (1.6) and (1.7) also
yield
\[
  \sum_{p \leq Y} (\log p)\, w(C, p) = e(C) \sum_{\substack{p \leq Y \\ p \in R(q, C)}} (\log p) + O\bigl( (\log Y)(\log |q|) \bigr).
\]
Thus, for all $q \in F(Q)$, all $C \in K(q)$ and all $Y \leq X$, we have
\[
  \Bigl| \psi(Y; q, C) - \frac{Y}{e(C) h(q)} \Bigr|
  \leq \Bigl| \psi_0(Y; q, C) - \frac{Y}{h(q)} \Bigr| + O\bigl( Y^{1/2} (\log Y)^3 + (\log Y)(\log |q|) \bigr). \tag{2.81}
\]
Summing over $q \in F(Q)$, we see that the remainder term is negligible in Theorem 2.15.
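The unsmoothing process from $\psi_k(Y; q, C)$ to $\psi_0(Y; q, C)$ is similar to the one for the original
Chebyshev functions (see [Bom87, §7.4]); we include it for the sake of completeness: Since
$\psi_k(X; q, C)$ is positive and increasing, the mean value theorem implies that
\[
  \alpha^{-1} \int_{e^{-\alpha} X}^{X} \psi_k(t; q, C) \frac{dt}{t}
  \leq \psi_k(X; q, C)
  \leq \alpha^{-1} \int_{X}^{e^{\alpha} X} \psi_k(t; q, C) \frac{dt}{t}
\]
for every $\alpha > 0$. Combining these inequalities with
\[
  \psi_k(X; q, C) = \int_1^X \psi_{k-1}(t; q, C) \frac{dt}{t}
\]
and the Taylor expansion of $e^{\alpha}$, we deduce that
\[
  \max_{C \in K(q)} \max_{Y \leq X} \Bigl| \psi_k(Y; q, C) - \frac{Y}{h(q)} \Bigr|
  \ll \alpha \frac{X}{h(q)} + \alpha^{-1} \max_{C \in K(q)} \max_{Y \leq e^{\alpha} X} \Bigl| \psi_{k+1}(Y; q, C) - \frac{Y}{h(q)} \Bigr|.
\]
If we now set $\alpha = (\log X)^{-(A'+1)/2}$ and if
\[
  \sum_{q \in F(Q)} \max_{C \in K(q)} \max_{Y \leq X} \Bigl| \psi_2(Y; q, C) - \frac{Y}{h(q)} \Bigr| \ll Q^{1/2} X (\log X)^{-A'} \tag{2.82}
\]
holds for some $A' > 0$, then
\[
  \sum_{q \in F(Q)} \max_{C \in K(q)} \max_{Y \leq X} \Bigl| \psi_1(Y; q, C) - \frac{Y}{h(q)} \Bigr|
  \ll Q^{1/2} X (\log X)^{-(A'-1)/2} + \alpha \sum_{q \in F(Q)} \frac{X}{h(q)}.
\]
Repeating the procedure with $\alpha' = (\log X)^{-((A'-1)/2+1)/2}$ (which is larger than $\alpha$) we get
\[
  \sum_{q \in F(Q)} \max_{C \in K(q)} \max_{Y \leq X} \Bigl| \psi_0(Y; q, C) - \frac{Y}{h(q)} \Bigr|
  \ll Q^{1/2} X (\log X)^{-(A'-3)/4} + \alpha' \sum_{q \in F(Q)} \frac{X}{h(q)}. \tag{2.83}
\]
The second term on the right side is at most
\[
\begin{aligned}
  X (\log X)^{-(A'+1)/4} \Bigl( \sum_{q \in F_{\mathrm{ex}}(Q)} 1 + \sum_{q \in F(Q) \setminus F_{\mathrm{ex}}(Q)} \frac{1}{h(q)} \Bigr)
  &\ll X (\log X)^{-(A'+1)/4} \bigl( (\log \log Q) + Q^{1/2} (\log Q) \bigr) \\
  &\ll Q^{1/2} X (\log X)^{-(A'-3)/4},
\end{aligned}
\tag{2.84}
\]
which follows by using the bounds $|F_{\mathrm{ex}}(Q)| \ll \log \log Q$ and $h(q) \gg |q|^{1/2} (\log |q|)^{-1}$ for
$q \in F(Q) \setminus F_{\mathrm{ex}}(Q)$, which we have found in Section 2.3.2.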
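The loss of logarithmic powers in the two unsmoothing steps is simple exact arithmetic, which the following Python sketch (with hypothetical function names) replays: a saving $(\log X)^{-a}$ for $\psi_{k+1}$ combined with $\alpha = (\log X)^{-(a+1)/2}$ leaves a saving $(\log X)^{-(a-1)/2}$ for $\psi_k$. Starting from $A'$ and applying the step twice gives $(A'-3)/4$, so a final saving of $(\log X)^{-A}$ requires $A' = 4A + 3$.

```python
from fractions import Fraction

def unsmooth_step(a):
    """One step from psi_{k+1} to psi_k.

    Input: exponent a of the saving (log X)^(-a) for psi_{k+1}.
    Output: (exponent of the saving for psi_k, exponent e with alpha = (log X)^(-e))."""
    a = Fraction(a)
    alpha_exponent = (a + 1) / 2
    return a - alpha_exponent, alpha_exponent

def saving_after_two_steps(a_prime):
    """Saving exponent for psi_0 when psi_2 saves (log X)^(-a_prime)."""
    after_first, _ = unsmooth_step(a_prime)
    after_second, alpha_prime_exponent = unsmooth_step(after_first)
    return after_second, alpha_prime_exponent

if __name__ == "__main__":
    for A in (1, 2, 5):
        print(A, saving_after_two_steps(4 * A + 3))
```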
Therefore, Theorem 2.15 will follow from (2.81), (2.83) and (2.84) as soon as we prove the
bound (2.82) for
\[
  Q^{10+\varepsilon} \leq X (\log X)^{-B} \tag{2.85}
\]
with $B = B(A') = 10A' + 190$ and then set $A' = 4A + 3$. We split the left side of (2.82) into
\[
\begin{aligned}
  \sum_{q \in F(Q)} \max_{C \in K(q)} \max_{Y \leq X} \Bigl| \psi_2(Y; q, C) - \frac{Y}{h(q)} \Bigr|
  &\leq \sum_{q \in F(Q)} \max_{C \in K(q)} \max_{Y \leq X} \Bigl| \psi_2(Y; q, C) - \frac{1}{h(q)} \sum_{K \in K(q)} \psi_2(Y; q, K) \Bigr| \\
  &\quad + \sum_{q \in F(Q)} \max_{Y \leq X} \frac{\bigl| Y - \sum_{K \in K(q)} \psi_2(Y; q, K) \bigr|}{h(q)}.
\end{aligned}
\tag{2.86}
\]
The first term on the right side of (2.86) is $\ll Q^{1/2} X (\log X)^{-A'}$ by Theorem 2.13 if (2.85) holds
and $B = 10A' + 190$. As for the second term, we note that equation (1.8) yields
\[
  \sum_{K \in K(q)} \sum_{p \leq Y} (\log p) \Bigl( \log \frac{Y}{p} \Bigr)^{2} w(K, p)
  = \sum_{p \leq Y} (\log p) \Bigl( \log \frac{Y}{p} \Bigr)^{2} (1 + \chi_q(p)).
\]
Thus
\[
\begin{aligned}
  \Bigl| Y - \sum_{K \in K(q)} \psi_2(Y; q, K) \Bigr|
  &\leq \Bigl| Y - \frac{1}{2} \sum_{K \in K(q)} \sum_{p \leq Y} (\log p) \Bigl( \log \frac{Y}{p} \Bigr)^{2} w(K, p) \Bigr| + O\bigl( Y^{1/2} (\log Y)^3 \bigr) \\
  &\leq |Y - \psi_2(Y)| + |\psi_2(Y; \chi_q)| + O\bigl( Y^{1/2} (\log Y)^3 \bigr),
\end{aligned}
\]
where $\psi_2(Y; \chi_q)$ was defined in (2.72) and where we have set $\psi_2(Y) := \psi_2(Y; 1)$. Summing over
$q \in F(Q)$, we see that the remainder term is negligible in Theorem 2.15. By (2.74) and the
Prime Number Theorem, we have
\[
  Y - \psi_2(Y) \ll_D Y (\log Y)^{-D}
\]
for all $D > 0$. Thus, the bound
\[
  \sum_{q \in F(Q)} \max_{Y \leq X} \frac{|Y - \psi_2(Y)|}{h(q)} \ll Q^{1/2} X (\log X)^{-A'}
\]
follows from splitting the sum into exceptional and non-exceptional discriminants as in (2.84).
As for the term $\psi_2(Y; \chi_q)$ above, we first note that
\[
  \sum_{q \in F_{\mathrm{ex}}(Q)} \max_{Y \leq X} \frac{|\psi_2(Y; \chi_q)|}{h(q)} \ll (\log \log Q)\, X (\log X)^2
\]
is negligible if $Q$ is not too small, i.e. if $Q \geq (\log X)^{2A'+6}$; but if $Q$ is small, then
\[
  \sum_{q \in F(Q)} \max_{Y \leq X} \frac{|\psi_2(Y; \chi_q)|}{h(q)}
\]
is negligible by the Siegel–Walfisz theorem (see (2.75)). Thus, it remains to bound the sum over
$q \in F(Q) \setminus F_{\mathrm{ex}}(Q)$ and this may be accomplished by means of the original Bombieri–Vinogradov
theorem – or rather the underlying average character sum that we have used in Section 2.3.4
(compare the bound (2.77) for $E_{2;k}''$ with $\nu = 1$ and $k = 2$ there). Hence we also get
\[
  \sum_{q \in F(Q)} \max_{Y \leq X} \frac{|\psi_2(Y; \chi_q)|}{h(q)} \ll Q^{1/2} X (\log X)^{-A'}
\]
if $Q^2 \leq X (\log X)^{-B'}$ for some $B' = B'(A') > 0$ (which may be chosen as small as $B(A')$ above).
In summary, the same bound holds for the second term on the right-hand side of (2.86) in the
same range, which is larger than the range (2.85) for which we have bounded the first term.
This finishes the proof of (2.82) in the range (2.85) with $B = 10A' + 190$ and therefore it
also concludes the proof of Theorem 2.15.
Proof of Theorem 2.16. The assertion of this theorem follows easily by partial integration; we
skip the proof since the corresponding proof for the original Bombieri–Vinogradov theorem is
given in full detail in [Br95, pp. 201/2] and nothing new happens in our case. We just remark
that it comes in handy at this point to have Theorem 2.15 with the feature “$\max_{Y \le X}$” – which
we got for free in the preceding proofs, but which is a priori not obvious – because the integrals
that appear in the transition from the Chebyshev function to the prime counting function may
thus be estimated trivially.
2.4 The mean square distribution
In this section we consider the “variance” of the distribution of primes which can be represented
by positive definite binary quadratic forms when the discriminant varies over bounded subsets
of the set of negative fundamental discriminants and when we additionally average over all
form classes in the corresponding form class groups. This can be viewed as an analogue of the
Barban–Davenport–Halberstam theorem for primes in arithmetic progressions, Theorem 2.5.
In fact, we show a more general mean square distribution result for arithmetic functions that
satisfy Siegel–Walfisz conditions for both arithmetic progressions and form classes.
2.4.1 Statement of results
The representability of primes is well distributed over almost all form classes to almost all
(negative fundamental) discriminants in long ranges:
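Theorem 2.29. Let $\mathcal{F}(Q)$ be the set of all negative fundamental discriminants $q \not\equiv 0 \pmod 8$ with $|q| \le Q$, let $A > 0$ be arbitrarily large and let $\varepsilon > 0$ be arbitrarily small. Then
\[
\sum_{q\in\mathcal{F}(Q)} \sum_{C\in K(q)} \left( \pi(X;q,C) - \frac{\mathrm{li}(X)}{e(C)h(q)} \right)^{2} \ll_{A,\varepsilon} Q^{1/2} X^2 (\log X)^{-A}
\tag{2.87}
\]
if $Q^{3+\varepsilon} \le X(\log X)^{-2A-6}$.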
In the Barban–Davenport–Halberstam theorem for arithmetic progressions, the prime counting function can be replaced by many other arithmetic functions g. Indeed, it suffices to
show that g is well distributed in arithmetic progressions to small moduli in order to prove
that g shows a similar behaviour for almost all residue classes to almost all large moduli; see
[IK04, §17.4] or [Br95, §5.6], for example.
We encounter some difficulties when we try to generalize Theorem 2.29 accordingly, i.e.
when we attempt to show that well-distribution (in a sense) of g with respect to form classes to
small discriminants implies that g shows at most a small deviation from the expected “average
behaviour” for almost all form classes to almost all large discriminants: First, we recall that
every positive integer n may usually be represented by forms from different form classes of a
given discriminant (or by no forms at all), whereas n always lies in exactly one residue class
of a given modulus. Secondly – and very much related to the first point –, we have seen that,
for example, non-ambiguous classes represent about twice as many primes as ambiguous classes
(the factor e(C) accounts for this fact in (2.87)). Regarding Proposition 1.6 as well as the
subsequent remarks and discussion, it is reasonable to say (and we have already vaguely used
that notion in this sense in Section 2.3) that a function g is well distributed with respect to form
classes when the weighted difference
\[
\sum_{n\le X} w(C,n)g(n) - \frac{1}{h(q)} \sum_{K\in K(q)} \sum_{n\le X} w(K,n)g(n)
\tag{2.88}
\]
is small for all form classes C of some discriminant q ∈ F; recall that the weights w(C, n) are
given by (1.4) or equivalently by (1.5). We will assume that this kind of well-distribution with
respect to form classes holds uniformly for all “small” negative fundamental discriminants.
Moreover, as seen in the proofs of the preceding section, sums which involve real class group
characters may be reduced to sums which involve Dirichlet characters. Thus, we will also require
the function g to be well distributed in arithmetic progressions to all small moduli and the sums
\[
\sum_{n\le X} g(n) \sum_{\substack{1<k,m<n \\ km=n}} \chi_1(k)\chi_2(m)
\]
to be small for most pairs (χ1 , χ2 ) of distinct primitive real Dirichlet characters.
Under these assumptions, we will prove that (2.88) is small for almost all form classes to
almost all discriminants in long ranges:
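Theorem 2.30. Let $3 \le Q \le X$, let $M(Q)$ be any subset of $\mathcal{F}(Q)$ and let $g$ be an arithmetic function. Assume that
\[
D(g;X;q,C) := \sum_{n\le X} w(C,n)g(n) - \frac{1}{h(q)} \sum_{K\in K(q)} \sum_{n\le X} w(K,n)g(n)
\ll_L X^{1/2}(\log X)^{-L} \left( \sum_{n\le X} |g(n)|^2 \right)^{1/2}
\tag{2.89}
\]
for all $L > 0$, all $q \in \mathcal{F}(Q)$ and all form classes $C \in K(q)$. Also assume that
\[
\sum_{\substack{n\le X \\ n\equiv a \ (\mathrm{mod}\ q)}} g(n) - \frac{1}{\varphi(q)} \sum_{\substack{n\le X \\ (n,q)=1}} g(n)
\ll_L X^{1/2}(\log X)^{-L} \left( \sum_{\substack{n\le X \\ (n,q)=1}} |g(n)|^2 \right)^{1/2}
\tag{2.90}
\]
for all $L > 0$, all $q \in \mathcal{F}(Q)$ and all integers $a$ with $(a,q) = 1$. Set
\[
R(g,Q,X) := \sum_{|d_1|>1} \sum_{\substack{|d_2|>1 \\ d_1d_2\in M(Q)}} \frac{1}{h(d_1d_2)} \left| \sum_{n\le X} g(n) \sum_{\substack{1<k,m<n \\ km=n}} \chi_{d_1}(k)\chi_{d_2}(m) \right|^{2},
\]
where the outer sums run over (positive and negative) fundamental discriminants and $\chi_d$ denotes the primitive real Dirichlet character modulo $|d|$.
Then
\[
\sum_{q\in M(Q)} \sum_{C\in K(q)} |D(g;X;q,C)|^2
\ll_{A,\varepsilon} Q^{1/2} X^{1/2} \left( Q^{3/2+\varepsilon}(\log X)^3 + X^{1/2}(\log X)^{-A} \right) \sum_{n\le X} |g(n)|^2 + R(g,Q,X)
\tag{2.91}
\]
for all arbitrarily large $A > 0$ and all arbitrarily small $\varepsilon > 0$.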
Remark 2.31. Assumption (2.89) is “non-trivial” only for $|q| < (\log X)^{8L+32}$ since
\[
\sum_{n\le X} w(C,n)g(n) \ll \left( \frac{X^{1/4}}{|q|^{1/8}} + X^{1/8} \right) X^{1/4} (\log X)^4 \left( \sum_{n\le X} |g(n)|^2 \right)^{1/2}
\]
by the Cauchy–Schwarz inequality and the easy bounds (1.14) and (2.8). Similarly, the second assumption (2.90) is non-trivial only for $|q| < (\log X)^{2L}$.
Theorem 2.29 follows easily from Theorem 2.30: Let g be the characteristic function of the
primes. Assumption (2.90) holds by Remark 2.31, the Siegel–Walfisz theorem for arithmetic
progressions (Theorem 2.2) and the Prime Number Theorem. As for assumption (2.89), we have
\[
\begin{aligned}
D(g;X;q,C) &= \sum_{p\le X} w(C,p) - \frac{1}{h(q)} \sum_{K\in K(q)} \sum_{p\le X} w(K,p) \\
&= \pi(X;q,C)e(C) - \frac{1}{h(q)} \sum_{p\le X} (1+\chi_q(p)) + O(\log|q|)
\end{aligned}
\]
by (1.8). Assumption (2.89) now follows from Remark 2.31, the Siegel–Walfisz theorem in the form (2.75), the Prime Number Theorem and the Siegel–Walfisz theorem for binary quadratic forms (Theorem 2.1). The term $R(g,Q,X)$ vanishes. Thus, from (2.91) we get
\[
\sum_{q\in\mathcal{F}(Q)} \sum_{C\in K(q)} \left| \pi(X;q,C)e(C) - \frac{1}{h(q)} \sum_{p\le X} (1+\chi_q(p)) \right|^{2} \ll_{A,\varepsilon} Q^{1/2} X^2 (\log X)^{-A}
\tag{2.92}
\]
if $Q^{3+\varepsilon} \le X(\log X)^{-2A-6}$. Similarly to the argument in Section 2.3.2, one shows that the contribution from exceptional discriminants to the left side of (2.92) is negligible (also compare the corresponding argument in the next subsection). Thus, we may assume the class number bound (2.33). Dyadic decomposition and the large sieve inequality for Dirichlet characters (Lemma 2.6) then yield
\[
\sum_{q\in\mathcal{F}(Q)} \frac{1}{h(q)} \left| \sum_{p\le X} \chi_q(p) \right|^{2} \ll (\log Q)^2 \left( Q^{3/2} X + X^2 \right).
\tag{2.93}
\]
Therefore, Theorem 2.29 follows from (2.92), (2.93) and the Prime Number Theorem if $Q > (\log X)^{2A+4}$. If $Q$ is smaller, Theorem 2.29 follows directly from Theorem 2.1.
Remarks:
(1) The term R(g, Q, X) clearly vanishes if the function g is supported on primes only or
if the set M (Q) contains only prime discriminants, for example. Thus, we get a clean
well-distribution result in these cases. It would be interesting to find other cases in which
R(g, Q, X) is dominated by the first term on the right-hand side of (2.91).
(2) The proof of Theorem 2.30 will be very similar to what was done in the preceding section
(and, of course, similar to the proof of the Barban–Davenport–Halberstam theorem for
arithmetic progressions). In fact, most of the steps will be much simpler. It will be clear
from the proof that we could have also proved results for sets with divisor frequency ν < 1
as in Theorem 2.13 and Theorem 2.14.
(3) By the easy estimate (1.14) and the class number bound $|q|^{1/2}(\log|q|)^{-1} \ll h(q)$ (for non-exceptional discriminants), we have the “trivial” estimate
\[
\sum_{q\in\mathcal{F}(Q)} \sum_{C\in K(q)} \left( \pi(X;q,C) - \frac{\mathrm{li}(X)}{e(C)h(q)} \right)^{2}
\ll \sum_{q\in\mathcal{F}(Q)} \sum_{C\in K(q)} \left( \frac{X^2}{|q|} + \frac{X^2}{h(q)^2} \right)
\ll Q^{1/2}(\log Q) X^2.
\]
Thus, we save once again an arbitrary power of $(\log X)$ over this estimate.
Assuming the Lindelöf Hypothesis as in Theorem 2.17, we can increase the admissible range for
the fundamental discriminants in Theorem 2.29 almost up to Q ≈ X:
Theorem 2.32. Assume the Lindelöf Hypothesis for Rankin–Selberg convolutions of holomorphic cusp forms of weight one (or only assume the bound (2.16) for all pairs of complex class group characters). Then Theorem 2.29 holds for $Q^{1+\varepsilon} \le X(\log X)^{-2A-6}$.
See Remark 2.33 for the proof of this conditional result.
2.4.2 Preliminaries
First, we consider the contribution coming from the initial range of negative fundamental discriminants. Fix $A > 0$. Set $Q_0 = (\log X)^{L_0}$ for some $L_0 > 0$, which will be chosen later and which will depend on $A$ only. By assumption (2.89) and the class number bound $h(q) \ll |q|^{1/2}(\log|q|)$, the contribution to the left-hand side of (2.91) coming from discriminants $q$ with $|q| \le Q_0$ is
\[
\ll_{L_1} (\log Q_0)\, Q_0^{3/2} X (\log X)^{-L_1} \sum_{n\le X} |g(n)|^2
\ll_{L_1} Q_0^{1/2} X (\log X)^{L_0 - L_1 + 1} \sum_{n\le X} |g(n)|^2
\]
for each $L_1 > L_0$. This is dominated by the right-hand side of (2.91) if
\[
L_0 - L_1 + 1 \le -A.
\tag{2.94}
\]
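It remains to consider the large discriminants, i.e. all $q$ in
\[
M'(Q) := \{ q \in M(Q) : Q_0 < |q| \le Q \},
\]
and we may assume from now on that $Q > Q_0$. By the definition (1.4) of the weights $w(C,n)$, we have
\[
D(g;X;q,C) = \sum_{\substack{\mathfrak{a}\in B_q(C)\cap Z(q) \\ \mathrm{N}(\mathfrak{a})\le X}} g(\mathrm{N}(\mathfrak{a})) - \frac{1}{h(q)} \sum_{\substack{\mathfrak{a}\in Z(q) \\ \mathrm{N}(\mathfrak{a})\le X}} g(\mathrm{N}(\mathfrak{a})).
\]
For every $q \in \mathcal{F}$ and every $\chi \in \widehat{H}(q)$, we set
\[
G(X;\chi,q) := \sum_{\substack{\mathfrak{a}\in Z(q) \\ \mathrm{N}(\mathfrak{a})\le X}} g(\mathrm{N}(\mathfrak{a}))\chi(\mathfrak{a}) = \sum_{n\le X} g(n)\lambda_\chi(n).
\]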
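By the orthogonality property of ideal class group characters, we may rewrite $D(g;X;q,C)$ as
\[
\begin{aligned}
D(g;X;q,C) &= \sum_{\substack{\mathfrak{a}\in Z(q) \\ \mathrm{N}(\mathfrak{a})\le X}} g(\mathrm{N}(\mathfrak{a})) \left( \frac{1}{h(q)} \sum_{\chi\in\widehat{H}(q)} \overline{\chi}(B_q(C))\chi(\mathfrak{a}) - \frac{1}{h(q)} \right) \\
&= \frac{1}{h(q)} \sum_{\chi\in\widehat{H}(q)\setminus\{\chi_0^{(q)}\}} \overline{\chi}(B_q(C))\, G(X;\chi,q).
\end{aligned}
\]
Moreover, orthogonality also yields
\[
\begin{aligned}
\sum_{C\in H(q)} \left| \sum_{\chi\in\widehat{H}(q)\setminus\{\chi_0^{(q)}\}} \overline{\chi}(C)\, G(X;\chi,q) \right|^{2}
&= \sum_{\chi_1,\chi_2\in\widehat{H}(q)\setminus\{\chi_0^{(q)}\}} G(X;\chi_1,q)\overline{G(X;\chi_2,q)} \sum_{C\in H(q)} \overline{\chi_1}(C)\chi_2(C) \\
&= h(q) \sum_{\chi\in\widehat{H}(q)\setminus\{\chi_0^{(q)}\}} |G(X;\chi,q)|^2.
\end{aligned}
\]
Thus, the contribution from large discriminants to the left-hand side of (2.91) is
\[
\begin{aligned}
\sum_{q\in M'(Q)} \sum_{C\in H(q)} \left| D(g;X;q,B_q^{-1}(C)) \right|^{2}
&= \sum_{q\in M'(Q)} \frac{1}{(h(q))^2} \sum_{C\in H(q)} \left| \sum_{\chi\in\widehat{H}(q)\setminus\{\chi_0^{(q)}\}} \overline{\chi}(C)\, G(X;\chi,q) \right|^{2} \\
&= \sum_{q\in M'(Q)} \frac{1}{h(q)} \sum_{\chi\in\widehat{H}(q)\setminus\{\chi_0^{(q)}\}} |G(X;\chi,q)|^2.
\end{aligned}
\]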
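The orthogonality step above can be checked numerically. The following is a minimal sketch under a simplifying assumption: the class group is modelled by the cyclic group Z/hZ (a hypothetical toy model, not the thesis's setting), its characters are the exponentials exp(2πi·jc/h), and the values G(χ) are arbitrary complex numbers. The helper names `characters` and `both_sides` are illustrative only.

```python
import cmath
import random


def characters(h):
    """Character table of the cyclic group Z/hZ: row j is chi_j(c) = exp(2*pi*i*j*c/h).

    Row 0 is the trivial character chi_0.
    """
    return [[cmath.exp(2j * cmath.pi * j * c / h) for c in range(h)] for j in range(h)]


def both_sides(h, G):
    """Return (lhs, rhs) of the identity
    sum_C |sum_{chi != chi_0} conj(chi(C)) G(chi)|^2 = h * sum_{chi != chi_0} |G(chi)|^2,
    where G[j] plays the role of G(X; chi_j, q). G[0] is ignored.
    """
    chars = characters(h)
    lhs = sum(
        abs(sum(chars[j][c].conjugate() * G[j] for j in range(1, h))) ** 2
        for c in range(h)
    )
    rhs = h * sum(abs(G[j]) ** 2 for j in range(1, h))
    return lhs, rhs


random.seed(0)
h = 6  # toy "class number" (assumption for illustration)
G = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(h)]
lhs, rhs = both_sides(h, G)
print(abs(lhs - rhs) < 1e-9)
```

The cross terms with χ1 ≠ χ2 cancel after summing over all classes, which is exactly why only the diagonal terms |G(X; χ, q)|² survive in the displayed identity.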
The contribution coming from exceptional discriminants is again negligible if $Q$ is not very small. Indeed, by the bound $|\mathcal{F}_{\mathrm{ex}}(Q)| \ll \log\log Q$ (see the proof of Proposition 2.20) for the set of exceptional fundamental discriminants $q \in \mathcal{F}(Q)$, the Cauchy–Schwarz inequality and the bound (2.8), we have
\[
\sum_{q\in\mathcal{F}_{\mathrm{ex}}(Q)} \frac{1}{h(q)} \sum_{\chi\in\widehat{H}(q)\setminus\{\chi_0^{(q)}\}} |G(X;\chi,q)|^2 \ll X(\log X)^4 \sum_{n\le X} |g(n)|^2.
\]
In particular, the contribution to the left-hand side of (2.91) coming from exceptional discriminants is negligible if $Q > (\log X)^{2A+8}$. This means that we must choose at least
\[
L_0 \ge 2A + 8
\tag{2.95}
\]
above.
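Therefore it remains to estimate the contribution from non-exceptional discriminants, i.e. we have to bound
\[
\sum_{q\in M''(Q)} \frac{1}{h(q)} \sum_{\substack{\chi\in\widehat{H}(q) \\ \chi^2\ne\chi_0^{(q)}}} |G(X;\chi,q)|^2
+ \sum_{q\in M''(Q)} \frac{1}{h(q)} \sum_{\substack{\chi\in\widehat{H}(q)\setminus\{\chi_0^{(q)}\} \\ \chi^2=\chi_0^{(q)}}} |G(X;\chi,q)|^2,
\tag{2.96}
\]
where $M''(Q) = M'(Q) \setminus \mathcal{F}_{\mathrm{ex}}(Q)$.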
2.4.3 Complex class group characters
The lower class number bound (2.33), dyadic decomposition and the large sieve inequality for complex class group characters (Lemma 2.7) together imply that the first sum in (2.96) is bounded above by
\[
\begin{aligned}
&(\log Q)^2 \max_{Q_0\le Q_1\le Q} Q_1^{-1/2} \sum_{q\in M''(Q_1)} \sum_{\substack{\chi\in\widehat{H}(q) \\ \chi^2\ne\chi_0^{(q)}}} |G(X;\chi,q)|^2 \\
&\quad \ll_\varepsilon (\log Q)^2 \max_{Q_0\le Q_1\le Q} Q_1^{-1/2} \left( X(\log X)^3 + X^{1/2}(\log X)\, Q_1^{5/2+\varepsilon} \right) \sum_{n\le X} |g(n)|^2 \\
&\quad \ll Q^{1/2} X^{1/2} \left( X^{1/2}(\log X)^5 Q^{-1/2} Q_0^{-1/2} + (\log X)^3 Q^{3/2+\varepsilon} \right) \sum_{n\le X} |g(n)|^2
\end{aligned}
\tag{2.97}
\]
for every $\varepsilon > 0$. This is dominated by the right-hand side of (2.91) if $Q > (\log X)^{2A+10-L_0}$, which is certainly satisfied if the above-mentioned condition $L_0 \ge 2A + 8$ holds.
2.4.4 Real class group characters
As in Section 2.3.4, the second sum in (2.96) is handled by reducing it to a sum over real Dirichlet characters. If $q \in \mathcal{F}$ and $\chi \in \widehat{H}(q)$ is a real non-trivial class group character, then the Kronecker Factorization Formula (2.70) implies that $\lambda_\chi(n)$ is the Dirichlet convolution
\[
\lambda_\chi(n) = (\chi_{d_1} * \chi_{d_2})(n)
\tag{2.98}
\]
of two primitive real Dirichlet characters modulo the absolute values of non-trivial fundamental discriminants $d_1$ and $d_2$ with $d_1 d_2 = q$. Thus, if $\chi \in \widehat{H}(q)$ is non-trivial and real, then
\[
G(X;\chi,q) = \sum_{n\le X} g(n) \sum_{km=n} \chi_{d_1}(k)\chi_{d_2}(m)
\]
for some fundamental discriminants $d_1$ and $d_2$ with $d_1 d_2 = q$ and $|d_1|, |d_2| > 1$. Moreover, each such pair of discriminants induces one of the non-trivial real class group characters in $\widehat{H}(q)$.
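To make the convolution (2.98) concrete, here is a small sketch for the single discriminant q = −20 = (−4)·5, whose class group has order two, so that there is exactly one non-trivial (real) class group character. The choice of q and the function names `chi_minus4`, `chi_5` and `lam` are assumptions made for illustration; the two characters are hard-coded rather than computed from a general Kronecker symbol.

```python
def chi_minus4(k):
    """Primitive real Dirichlet character modulo 4 (fundamental discriminant -4)."""
    if k % 2 == 0:
        return 0
    return 1 if k % 4 == 1 else -1


def chi_5(m):
    """Primitive real Dirichlet character modulo 5 (fundamental discriminant 5),
    i.e. the Legendre symbol (m/5)."""
    return (0, 1, -1, -1, 1)[m % 5]


def lam(n):
    """Dirichlet convolution (chi_{-4} * chi_5)(n) = sum over k*m = n of chi_{-4}(k) * chi_5(m)."""
    return sum(chi_minus4(k) * chi_5(n // k) for k in range(1, n + 1) if n % k == 0)


# 29 = 3^2 + 5*2^2 is represented by the principal form x^2 + 5y^2,
# 3 is represented by the non-principal form 2x^2 + 2xy + 3y^2,
# and 11 is inert in Q(sqrt(-5)).
print(lam(29), lam(3), lam(11))
```

For a prime p not dividing 20 the value is χ₋₄(p) + χ₅(p): it equals 2 when p is represented by the principal form, −2 when p is represented by the other class, and 0 when p is inert, which matches the interpretation of λ_χ(n) as a sum of χ over ideals of norm n.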
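Let $F(Q)$ denote the set of all fundamental discriminants $d$ with $1 < |d| \le Q$ (as in Section 2.3.4). The second sum in (2.96) can thus be bounded as follows:
\[
\begin{aligned}
\sum_{q\in M''(Q)} \frac{1}{h(q)} \sum_{\substack{\chi\in\widehat{H}(q)\setminus\{\chi_0\} \\ \chi^2=\chi_0}} |G(X;\chi,q)|^2
&= \sum_{d_1\in F(Q)} \sum_{\substack{d_2\in F(Q) \\ d_1d_2\in M''(Q)}} \frac{1}{h(d_1d_2)} \left| \sum_{n\le X} g(n) \sum_{\substack{1\le k,m\le n \\ km=n}} \chi_{d_1}(k)\chi_{d_2}(m) \right|^{2} \\
&\ll (\log Q) \sum_{d_1\in F(Q)} \sum_{\substack{d_2\in F(Q) \\ d_1d_2\in M''(Q)}} \frac{1}{|d_1d_2|^{1/2}} \left| \sum_{n\le X} g(n)\left(\chi_{d_1}(n) + \chi_{d_2}(n)\right) \right|^{2} + R(g,Q,X) \\
&\ll (\log Q) \sum_{d_1\in F(Q)} \frac{1}{|d_1|^{1/2}} \left| \sum_{n\le X} g(n)\chi_{d_1}(n) \right|^{2} \sum_{d_2\le Q/|d_1|} \frac{1}{d_2^{1/2}} + R(g,Q,X) \\
&\ll S_1(Q,X) + S_2(Q,X) + R(g,Q,X),
\end{aligned}
\]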
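where
\[
S_1(Q,X) = Q_0^{1/2} (\log Q) \sum_{d\in F(Q_0)} \left| \sum_{n\le X} g(n)\chi_d(n) \right|^{2}
\]
and
\[
S_2(Q,X) = Q^{1/2} (\log Q)^2 \max_{Q_0\le Q_1\le Q} Q_1^{-1} \sum_{d\in F(Q_1)} \left| \sum_{n\le X} g(n)\chi_d(n) \right|^{2}
\]
and $R(g,Q,X)$ was defined in Theorem 2.30. By positivity and orthogonality, we have
\[
\begin{aligned}
S_1(Q,X) &\le Q_0^{1/2} (\log X) \sum_{1<d\le Q_0} \sum_{\substack{\chi \ (\mathrm{mod}\ d) \\ \chi\ne\chi_0}} \left| \sum_{n\le X} g(n)\chi(n) \right|^{2} \\
&= Q_0^{1/2} (\log X) \sum_{1<d\le Q_0} \varphi(d) \sum_{\substack{a \ (\mathrm{mod}\ d) \\ (a,d)=1}} \left| \sum_{\substack{n\le X \\ n\equiv a \ (\mathrm{mod}\ d)}} g(n) - \frac{1}{\varphi(d)} \sum_{\substack{n\le X \\ (n,d)=1}} g(n) \right|^{2}.
\end{aligned}
\]
By assumption (2.90), we thus have
\[
S_1(Q,X) \ll_{L_2} Q_0^{1/2} X (\log X)^{-L_2+1+3L_0} \sum_{n\le X} |g(n)|^2
\tag{2.99}
\]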
for all $L_2 > L_0$. Hence, $S_1(Q,X)$ is dominated by the right side of (2.91) if
\[
-L_2 + 1 + 3L_0 \le -A.
\tag{2.100}
\]
Finally, we use the large sieve inequality for Dirichlet characters, Lemma 2.6, to bound $S_2(Q,X)$. We get
\[
S_2(Q,X) \ll_{L_0} Q^{1/2} (\log X)^2 \left( Q + X Q_0^{-1} \right) \sum_{n\le X} |g(n)|^2.
\]
This is dominated by the right side of (2.91) if $Q + X Q_0^{-1} \le Q^{3/2+\varepsilon}(\log X) + X(\log X)^{-A-2}$, which is certainly true if the above-mentioned condition $L_0 \ge 2A + 8$ holds.
By (2.95), (2.94) and (2.100) we also see that all implied constants above that depend on $L_0$, $L_1$ or $L_2$ can be made dependent on $A$ only if we choose $L_0 = 2A + 8$, $L_1 = A + L_0 + 1$ and $L_2 = A + 3L_0 + 1$, for example. This concludes the proof of Theorem 2.30.
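As a quick sanity check of this closing parameter choice, one can substitute the three values into the constraints (2.94), (2.95) and (2.100). The following worked computation uses only those constraints; it assumes that the inequality in (2.95) is non-strict, as the choice $L_0 = 2A + 8$ requires.

```latex
\begin{aligned}
L_0 - L_1 + 1 &= L_0 - (A + L_0 + 1) + 1 = -A, && \text{so (2.94) holds with equality},\\
-L_2 + 1 + 3L_0 &= -(A + 3L_0 + 1) + 1 + 3L_0 = -A, && \text{so (2.100) holds with equality},\\
L_0 &= 2A + 8, && \text{so (2.95) holds with equality}.
\end{aligned}
```

Since $A > 0$, the side conditions $L_1 > L_0$ and $L_2 > L_0$ used along the way are satisfied as well.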
Remark 2.33. If we assume the Lindelöf Hypothesis we may use the conditional large sieve inequality of Lemma 2.10 instead of Lemma 2.7. Thus, we may then replace the term $Q^{5/2+\varepsilon}$ in the second line of (2.97) by $Q^{3/2+\varepsilon}$; the term $Q^{3/2+\varepsilon}$ in the last line of (2.97) and in (2.91) may therefore be replaced by $Q^{1/2+\varepsilon}$. Thus, (2.87) holds if $Q^{1+\varepsilon} \le X(\log X)^{-2A-6}$, which concludes the proof of Theorem 2.32.
2.5 Applications and open questions
Our results of Bombieri–Vinogradov type and of Barban–Davenport–Halberstam type give
equidistribution results for primes represented by binary quadratic forms in smaller ranges
than the corresponding original theorems for arithmetic progressions. This will probably somewhat limit the scope of their applications. We only give two different types of consequences
which can be derived irrespectively of the size of the admissible ranges in the theorems of
Sections 2.3 and 2.4 and which are analogues of similar known results for primes in arithmetic
progressions. It would be extremely interesting to find applications which genuinely require
mean-value results on primes that are represented by binary quadratic forms and which are
more than mere adaptations of applications of the mean-value results for primes in arithmetic
progressions.
We will close this section by mentioning possible extensions and generalizations of our mean-value results.
2.5.1 The least prime represented by binary quadratic forms
The most natural question about primes of a specific shape is certainly to ask whether an
infinitude of primes of this shape exist; this was answered by Dirichlet’s theorem for primes
in arithmetic progressions and by Weber’s theorem for primes represented by binary quadratic
forms (Theorem 1.1). However, the question on the size of the least prime of a specific form is
probably a close second.
Both the Bombieri–Vinogradov theorem and the Barban–Davenport–Halberstam theorem
provide information on this question for primes in arithmetic progressions. Linnik [Lin44] proved
that there exist absolute, effectively computable constants c and L such that p(q; a), the least
prime p ≡ a (mod q) with (a, q) = 1, is at most $cq^L$. The best known admissible value for the constant L
is currently L = 5.18, which is due to Xylouris [Xyl11]. It is easy to show that, conditionally
on the Generalized Riemann Hypothesis, L = 2 + ε is admissible for every ε > 0; on the other
hand, Friedlander and Iwaniec [FI03b] proved the remarkable fact that one can take L = 1.983
if Landau–Siegel zeros exist (and satisfy certain properties).
Being average estimates, the Bombieri–Vinogradov theorem and the Barban–Davenport–
Halberstam theorem are not capable of giving bounds which hold for all arithmetic progressions.
However, one can achieve L = 2 + ε for all residue classes in almost all progressions and even
L = 1 + ε for most residue classes in almost all progressions.
Likewise our variants of these theorems can provide bounds for the size of the least prime
number which can be represented by a given positive definite binary quadratic form with the
bounds holding for all or almost all form classes and almost all fundamental discriminants.
In fact, using Theorem 2.29, we can show that the smallest prime represented by any binary quadratic form in a given class of a given discriminant is, for all $\varepsilon > 0$, at most $|q|^{3+\varepsilon}$ for most classes of forms and most discriminants $q$:
Theorem 2.34. For each negative fundamental discriminant $q \not\equiv 0 \pmod 8$ and each form class $C \in K(q)$, let
p(q; C) = the least prime which is represented by all binary quadratic forms in C.
Then, for each $\varepsilon > 0$, there exists a subset $S = S(\varepsilon)$ of the set $\mathcal{F}$ of negative fundamental discriminants $q \not\equiv 0 \pmod 8$ such that
\[
\lim_{N\to\infty} \frac{|S \cap [-N,-1]|}{|\mathcal{F} \cap [-N,-1]|} = 1,
\]
i.e. $S$ has asymptotic density 1 in $\mathcal{F}$, and
\[
\lim_{n\to\infty} \frac{\left|\{ C \in K(q_n) \mid p(q_n;C) \le |q_n|^{3+\varepsilon} \}\right|}{h(q_n)} = 1
\tag{2.101}
\]
holds for each sequence $(q_n)$ in $S$ with $|q_n| \to \infty$ as $n \to \infty$.
Remark. Elliott and Halberstam [EH71] proved the analogous theorem for arithmetic progressions (with exponent 1+ε) and Hinz [Hin81] generalized their result to arbitrary (fixed) number
fields.
Proof. Fix $\varepsilon > 0$ and set $A = 3 + \varepsilon$. Let $Q_0 = Q_0(A,\varepsilon)$ be sufficiently large, i.e. such that
\[
\left( \log\frac{Q}{2} \right)^{2A+7} \le \left( \frac{Q}{2} \right)^{\varepsilon}
\quad\text{and}\quad
Q^{3+2\varepsilon} \le \left( \frac{Q}{2} \right)^{3+2\varepsilon} \cdot \frac{(\log Q/2)^{2A+7}}{(\log Q^4)^{2A+6}}
\tag{2.102}
\]
for all $Q > Q_0$. For each $Q > Q_0$, we fix $X = X(Q,A,\varepsilon)$ with $X(\log X)^{-2A-6} = Q^{3+2\varepsilon}$ and we assume that $Q^4 > X$ (otherwise we increase $Q_0$). Moreover, let $T(Q)$ be the (possibly empty) subset of the fundamental discriminants in the interval $[-Q, -Q/2)$ such that $q$ lies in $T(Q)$ if and only if
\[
p(q;C) > |q|^{3+3\varepsilon}
\]
holds for at least $h(q)(\log|q|)^{-(A-3)}$ classes $C \in K(q)$.
Fix $Q > Q_0$. Then
\[
p(q;C) > \left( \frac{Q}{2} \right)^{3+3\varepsilon} \ge \left( \frac{Q}{2} \right)^{3+2\varepsilon} \left( \log\frac{Q}{2} \right)^{2A+7} \ge X(\log X)^{-2A-6} (\log Q^4)^{2A+6} > X
\]
holds for at least $h(q)(\log|q|)^{-(A-3)}$ classes $C \in K(q)$ for all $q \in T(Q)$; here we have used condition (2.102) and the choice of $X$. Thus, we have $\pi(X;q,C) = 0$ for at least $h(q)(\log|q|)^{-(A-3)}$ classes, and therefore
\[
\sum_{q\in T(Q)} \frac{h(q)}{(\log|q|)^{A-3}} \cdot \frac{(\mathrm{li}(X))^2}{(h(q))^2} \ll_{\varepsilon,A} \frac{Q^{1/2} X^2}{(\log X)^{A}},
\]
by Theorem 2.29. Hence
\[
|T(Q)| \ll_{\varepsilon,A} \frac{Q(\log Q)^{A-4}}{(\log X)^{A-2}} \le \frac{Q}{(\log Q)^2}.
\]
Set
\[
T = \bigcup_{m:\ Q_0\le 2^m} T(2^m).
\]
Then $\frac{|T \cap [-N,-1]|}{N} \ll (\log N)^{-1} \to 0$ as $N \to \infty$. Therefore, if $S$ is the complement of $T$ in $\mathcal{F}$, then $S$ has asymptotic density 1 and if $(q_n)$ is a sequence in $S$ with $|q_n| \to \infty$ as $n \to \infty$, then
\[
h(q_n) \ge \left|\{ C \in K(q_n) \mid p(q_n;C) \le |q_n|^{3+3\varepsilon} \}\right| \ge h(q_n) - h(q_n)(\log|q_n|)^{-(A-3)}
\]
for all $n \in \mathbb{N}$, hence (2.101) holds.
Our variant of the Bombieri–Vinogradov theorem, Theorem 2.16, gives a weaker bound, but
this time it holds for all classes of forms and most discriminants:
Theorem 2.35. For every $\varepsilon > 0$, the upper bound
\[
\max_{C\in K(q)} p(q;C) \le |q|^{10+\varepsilon}
\]
may only fail for fundamental discriminants q lying in a set V = V (ε) ⊂ F that has asymptotic
density 0 in F.
We skip the proof since it is very similar to the one of Theorem 2.34 above.
Remarks:
(1) From the Siegel–Walfisz theorem for binary quadratic forms, Theorem 2.1, it follows easily that there exists an absolute constant $L$ such that
\[
\max_{C\in K(q)} p(q;C) \ll |q|^{L(\log|q|)}
\]
for all $q \in \mathcal{F}$.
(2) Kowalski and Michel have proved in [KM02] a log-free zero-density estimate for automorphic forms on GL(n)/Q and described how this can be used to show the existence of an absolute constant $L$ such that
\[
\max_{C\in K(q)} p(q;C) \ll |q|^{L}
\tag{2.103}
\]
for all $q \in \mathcal{F}$. This bound is also a consequence of earlier results by Fogels [Fog65, Fog67] and Weiss [Wei83]. However, no explicit admissible value for $L$ has yet been published.
(3) The Generalized Riemann Hypothesis for the Dedekind zeta-functions of the Hilbert class fields of the fields $\mathbb{Q}(\sqrt{q})$ implies (see Theorem 1.16) that (2.103) holds for all $q \in \mathcal{F}$ with $L = 1 + \varepsilon$ for all $\varepsilon > 0$.
(4) Assuming the Lindelöf Hypothesis as in Theorem 2.17 and Theorem 2.32, one may replace
the exponent 3 + ε by 1 + ε in Theorem 2.34 and the exponent 10 + ε by 3 + ε in Theorem 2.35.
Focussing on the primes of the shape $x^2 + ny^2$, which are mentioned in the title of this thesis, that is, on primes represented by the principal class of discriminant $-4n$, it would be interesting to investigate bounds for the values of $x_{\min}$ and $y_{\min}$ that yield the smallest prime of this form for any given positive integer $n$. One would naturally assume that $y_{\min}$ is typically very small. Notwithstanding, it is somewhat surprising that numerical calculations even suggest that $y_{\min} > 1$ can only occur for an exceedingly small set of values $n$: Up to at least $n = 10^8$, the smallest prime of the form $x^2 + ny^2$ is actually of the form $x^2 + n$ in all but the eleven cases
\[
n \in \{5, 41, 59, 314, 341, 479, 626, 749, 755, 881, 1784\};
\]
in all these exceptional cases we have $y_{\min} = 2$. If we could show that $y_{\min} = 1$ for all $n > 1784$ (which appears to be formidable) or could at least get a nice bound for the number of exceptions, the problem of bounding the least prime of the form $x^2 + ny^2$ would reduce to bounding the smallest prime of the form $x^2 + n$. Although this polynomial looks simpler than our original one, questions about the primes which it represents are much tougher than for binary quadratic forms: There is no integer $n > 0$ for which it is currently known whether there are infinitely many primes of the form $x^2 + n$. Nevertheless, Baier and Zhao [BZ07] proved that, given $A, B > 0$,
\[
\sum_{\substack{n\le N \\ \mu(n)^2=1}} \left| \sum_{x\le X} \Lambda(x^2+n) - G(n)X \right|^{2} \ll \frac{N X^2}{(\log X)^{B}},
\]
where
\[
G(n) = \prod_{p>2} \left( 1 - \frac{\chi_n(p)}{p-1} \right),
\]
holds if $X^2(\log X)^{-A} \le N \le X^2$. Note that $G(n)$ converges and $G(n) \gg (\log n)^{-1} \gg (\log X)^{-1}$.
Thus, using a similar argument as in the proof of Theorem 2.34, we can conclude:
Theorem 2.36. For every $\varepsilon > 0$, the smallest prime of the form $x^2 + n$ is less than $n^{2+\varepsilon}$ for almost all positive squarefree integers $n$, that is, for all such $n$ outside a set of asymptotic density 0.
Consequently, conditional on the assumption that $p_0(n)$, the least prime of the form $x^2 + ny^2$, is attained for $y = 1$ for all positive squarefree integers $n$ in a set of asymptotic density 1, we have
\[
p_0(n) \le n^{2+\varepsilon}
\]
for all positive squarefree integers n in a set of asymptotic density 1.
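The numerical observation about y_min discussed in this subsection can be explored with a short brute-force search. The sketch below is not the computation used in the thesis; it assumes that both x and y are required to be at least 1 (so that, for instance, 5 = 0² + 5·1² is not counted), an interpretation that is consistent with n = 5 and n = 41 appearing among the listed exceptions. The names `is_prime` and `smallest_prime_of_shape` are illustrative.

```python
def is_prime(m):
    """Trial-division primality test, adequate for the small values used here."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True


def smallest_prime_of_shape(n):
    """Return (p, x, y) where p = x^2 + n*y^2 is the least prime of this shape
    with x >= 1 and y >= 1 (assumption: x = 0 is excluded).

    All values up to a search limit are enumerated; the limit is doubled
    until at least one prime is found.
    """
    limit = 4 * (n + 1)
    while True:
        best = None
        y = 1
        while 1 + n * y * y <= limit:
            x = 1
            while x * x + n * y * y <= limit:
                v = x * x + n * y * y
                if is_prime(v) and (best is None or v < best[0]):
                    best = (v, x, y)
                x += 1
            y += 1
        if best is not None:
            return best
        limit *= 2


# n for which the least prime needs y > 1, within a small range
exceptions = [n for n in range(1, 60) if smallest_prime_of_shape(n)[2] > 1]
print(exceptions)
```

For example, for n = 5 the values x² + 5 with x = 1, ..., 5 are all composite and the first prime of that shape is 41, whereas 3² + 5·2² = 29 is smaller, so y_min = 2 in this case.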
2.5.2 A mean-value estimate for products of two primes
Strictly speaking not a real application but rather another mean-value result of Bombieri–
Vinogradov type – this time for the sequence of numbers that are products of two primes – can
be derived from a combination of the results of Sections 2.3 and 2.4:
Theorem 2.37. Let $\varepsilon > 0$ and $A > 0$. Let $M_2(Q)$ be the set of positive squarefree integers $n \equiv 1 \pmod 4$ with $n \le Q$. For each $n \in M_2(Q)$, let $\pi_2(X;n)$ denote the number of integers $k \le X$ which can be written in the two shapes
\[
k = x^2 + ny^2 \quad\text{and}\quad k = p_1 p_2
\]
for some integers $x$ and $y$ as well as some primes $p_1$ and $p_2$ with $p_1, p_2 \le X^{1/2}$ both of which can be represented by positive definite binary quadratic forms of discriminant $-4n$. Then there exists a constant $B = B(A)$ such that
\[
\sum_{n\in M_2(Q)} \left| \pi_2(X;n) - \left( \frac{h(-4n)}{4} + 2^{\omega(n)-3} \right) \cdot \left( \frac{\mathrm{li}(X^{1/2})}{h(-4n)} \right)^{2} \right| \ll_{A,\varepsilon} \frac{Q^{1/2} X}{(\log X)^{A}}
\]
if $Q^{20+\varepsilon} \le X(\log X)^{-B}$.
Remark. With more effort it is possible to derive similar statements for integers $k = p_1 \cdots p_\ell$ with any fixed number $\ell$ of prime factors and $p_1 \le X^{a_1}, \ldots, p_\ell \le X^{a_\ell}$, where $a_1 \le \ldots \le a_\ell$ and $a_1 + \cdots + a_\ell = 1$. The resulting admissible range for the discriminants is then always determined by the smallest exponent $a_1$.
This type of result is due to Barban [Bar66, Theorem 3.3] in the case of primes in arithmetic
progressions and it was generalized to arbitrary number fields by Hinz [Hin81]. Their results
do not use results of Bombieri–Vinogradov type but only appropriate versions of the Barban–
Davenport–Halberstam theorem. We also have to use Theorem 2.16 because of the difference
(in the factor e(C)) between the asymptotic behaviours of the numbers of primes represented
by ambiguous or by non-ambiguous forms.
Proof. Set $Y = X^{1/2}$ and fix a positive squarefree integer $n \equiv 1 \pmod 4$ with $n \le Y$. For all $k \in \mathbb{N}$ and all $C \in H(-4n)$, let $\delta(C,k) = 1$ if there exists an ideal $\mathfrak{a} \in C$ with $\mathrm{N}(\mathfrak{a}) = k$, and $\delta(C,k) = 0$ otherwise. If $k$ is of the form $k = x^2 + ny^2$, then $k$ can be represented by the principal form of discriminant $-4n$; hence $\delta(P(-4n),k) = 1$ by Lemma 1.4. Since we also assume that $k = p_1 p_2$, where $p_1$ and $p_2$ are primes with $p_1, p_2 \le Y$ and both can be represented by positive definite binary quadratic forms of discriminant $-4n$, we must have $\delta(C_1,p_1) = 1$ and $\delta(C_2,p_2) = 1$ for two classes $C_1, C_2 \in H(-4n)$ with $C_1 C_2 = P(-4n)$, i.e. $C_2 = C_1^{-1}$.
Let $\pi_2^{(1)}(X;n)$ be the number of pairs of primes $p_1 \le p_2 \le Y$ with $p_1 p_2 \le X$ such that both $p_1$ and $p_2$ split in $\mathbb{Q}(\sqrt{-n})$ and both are representable by forms of an ambiguous class of discriminant $-4n$; then
\[
\begin{aligned}
\pi_2^{(1)}(X;n) &= \frac12 \sum_{\substack{p_1\le Y \\ p_1 \text{ splits}}} \sum_{\substack{p_2\le Y \\ p_2 \text{ splits}}} \sum_{\substack{C\in H(-4n) \\ C^2=P(-4n)}} \delta(C,p_1)\delta(C^{-1},p_2) + O(Y) \\
&= \frac12 \sum_{\substack{C\in K(-4n) \\ C^2=C_0(-4n)}} \pi(Y;-4n,C)\,\pi(Y;-4n,C^{-1}) + O\!\left(Y + \omega(4n)^2\right).
\end{aligned}
\]
The error term is a trivial bound for the squares of primes and the contribution from ramified primes in $\pi(Y;-4n,C)$ and $\pi(Y;-4n,C^{-1})$.
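Let $\pi_2^{(2)}(X;n)$ be the number of pairs of primes $p_1 \le p_2 \le Y$ with $p_1 p_2 \le X$ such that both $p_1$ and $p_2$ split in $\mathbb{Q}(\sqrt{-n})$ and both are representable by forms of non-ambiguous classes of discriminant $-4n$; then
\[
\begin{aligned}
\pi_2^{(2)}(X;n) &= \frac12 \cdot \frac12 \sum_{\substack{p_1\le Y \\ p_1 \text{ splits}}} \sum_{\substack{p_2\le Y \\ p_2 \text{ splits}}} \sum_{\substack{C\in H(-4n) \\ C^2\ne P(-4n)}} \delta(C,p_1)\delta(C^{-1},p_2) + O(Y) \\
&= \frac14 \sum_{\substack{C\in K(-4n) \\ C^2\ne C_0(-4n)}} \pi(Y;-4n,C)\,\pi(Y;-4n,C^{-1}) + O\!\left(Y + \omega(4n)^2\right).
\end{aligned}
\tag{2.104}
\]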
The additional factor $\frac12$ in (2.104) stems from the existence of two distinct classes which contain a prime ideal of norm $p_1$ if the classes are non-ambiguous.
Let $\pi_2^{(3)}(X;n)$ be the number of pairs of primes $p_1 \le p_2 \le Y$ with $p_1 p_2 \le X$ such that at least one of these primes ramifies in $\mathbb{Q}(\sqrt{-n})$ and both are representable by forms of discriminant $-4n$; then $\pi_2^{(3)}(X;n) \le \omega(4n)Y \le Y(\log Y)$.
Thus,
\[
\pi_2(X;n) = \sum_{i=1}^{3} \pi_2^{(i)}(X;n) = \sum_{C\in K(-4n)} \frac{\pi(Y;-4n,C)\,\pi(Y;-4n,C^{-1})}{2(3-e(C))} + O(Y(\log Y)).
\]
We have $|\{ C \in K(-4n) \mid C^2 = C_0(-4n) \}| = 2^{\omega(-4n)-1} = 2^{\omega(n)}$ (see Remark 1.3) and therefore
\[
\frac{h(-4n)}{4} + 2^{\omega(n)-3} = \frac38 \sum_{\substack{C\in K(-4n) \\ C^2=C_0(-4n)}} 1 + \frac14 \sum_{\substack{C\in K(-4n) \\ C^2\ne C_0(-4n)}} 1.
\]
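Thus,
\[
\begin{aligned}
&\sum_{n\in M_2(Q)} \left| \pi_2(X;n) - \left( \frac{h(-4n)}{4} + 2^{\omega(n)-3} \right) \left( \frac{\mathrm{li}(Y)}{h(-4n)} \right)^{2} \right| \\
&\quad \le \sum_{n\in M_2(Q)} \sum_{\substack{C\in K(-4n) \\ C^2=C_0(-4n)}} \left| \frac{\pi(Y;-4n,C)\,\pi(Y;-4n,C^{-1})}{2} - \frac38 \left( \frac{\mathrm{li}(Y)}{h(-4n)} \right)^{2} \right| \\
&\qquad + \sum_{n\in M_2(Q)} \sum_{\substack{C\in K(-4n) \\ C^2\ne C_0(-4n)}} \left| \frac{\pi(Y;-4n,C)\,\pi(Y;-4n,C^{-1})}{4} - \frac14 \left( \frac{\mathrm{li}(Y)}{h(-4n)} \right)^{2} \right| + O(QY(\log Y)).
\end{aligned}
\tag{2.105}
\]
By the triangle inequality, the first term on the right side is
\[
\begin{aligned}
&\le \sum_{n\in M_2(Q)} \sum_{\substack{C\in K(-4n) \\ C^2=C_0(-4n)}} \left| \pi(Y;-4n,C) - \frac{\mathrm{li}(Y)}{2h(-4n)} \right| \cdot \left| \pi(Y;-4n,C^{-1}) - \frac{\mathrm{li}(Y)}{2h(-4n)} \right| \\
&\quad + 2\,\mathrm{li}(Y) \sum_{n\in M_2(Q)} \max_{\substack{C\in K(-4n) \\ C^2=C_0(-4n)}} \left| \pi(Y;-4n,C) - \frac{\mathrm{li}(Y)}{2h(-4n)} \right|.
\end{aligned}
\tag{2.106}
\]
Let $B_1 = B_1(A)$ be the constant from Theorem 2.16 and let $B_2 = 2A + 6$ be the constant from Theorem 2.29. Set $B = \max(B_1, B_2)$. By Theorem 2.16, the second term of (2.106) is
\[
\ll Q^{1/2} Y^2 (\log Y)^{-A}
\]
if $Q^{10+\varepsilon} \le Y(\log Y)^{-B}$. By the Cauchy–Schwarz inequality and Theorem 2.29, the first term of (2.106) is
\[
\ll Q^{1/2} Y^2 (\log Y)^{-A}
\]
in the same range.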
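The second term on the right side of (2.105) is
\[
\begin{aligned}
&\le \sum_{n\in M_2(Q)} \sum_{\substack{C\in K(-4n) \\ C^2\ne C_0(-4n)}} \left| \pi(Y;-4n,C) - \frac{\mathrm{li}(Y)}{h(-4n)} \right| \cdot \left| \pi(Y;-4n,C^{-1}) - \frac{\mathrm{li}(Y)}{h(-4n)} \right| \\
&\quad + 2\,\mathrm{li}(Y) \sum_{n\in M_2(Q)} \max_{\substack{C\in K(-4n) \\ C^2\ne C_0(-4n)}} \left| \pi(Y;-4n,C) - \frac{\mathrm{li}(Y)}{h(-4n)} \right|.
\end{aligned}
\]
By Theorem 2.16 and Theorem 2.29 this is again
\[
\ll Q^{1/2} Y^2 (\log Y)^{-A}
\]
if $Q^{10+\varepsilon} \le Y(\log Y)^{-B}$. It finally remains to note that the error term on the right side of (2.105) is negligible.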
2.5.3 Possible extensions and generalizations
We would like to end this chapter by mentioning some potential extensions of the results which
we have proved in this chapter for primitive positive definite binary quadratic forms with fundamental discriminants $q \not\equiv 0 \pmod 8$.
(1) The restriction to fundamental discriminants and the congruence restriction $q \not\equiv 0 \pmod 8$
seem to be the easiest to drop, but this would render certain parts of the proofs more
technical. Primarily, this restriction allowed us to not have to worry about odd prime
factors of the discriminants appearing more than once. Thus, we were able to find a bound
for the conductor of Rankin–Selberg convolutions of class group L-functions in Section 2.2
by a quite easy application of Li’s functional equation; odd square factors (and, to a lesser
extent, the factor 8) would have turned this into a rather messy affair or would have required
us to use much deeper results on local Langlands correspondence; see the last footnote in
the proof of Lemma 2.8. We also profited from the absence of square factors when using the
Kronecker Factorization Formula (2.70). However, an analogue of this formula is known to
hold for non-fundamental discriminants as well (see [Fog61, Lemma 4]), so that this part
of the proof could probably be adapted accordingly without much trouble.
(2) A rather different picture emerges for primitive binary quadratic forms of positive discriminant q. The proof of the infinitude of primes which are represented by any such form is
essentially the same as for negative discriminants. But already the corresponding prime
number theorem is significantly more complicated: The first proof by de la Vallée Poussin
[dlVP97] was 92 pages long, even though he skipped the parts which were the same as
in his 35-page proof for negative discriminants (Landau was later able to simplify both
proofs but a notable difference in complexity persists).
The additional difficulty lies mainly in the presence of non-trivial units. Their number –
or rather their “density” – is hard to control, but it determines the regulator, which is a
supplementary factor in the class number formula for real quadratic fields. This in turn
makes it difficult to obtain good estimates for the class number and would obviously be
an obstacle to overcome when trying to find mean-value estimates for the error term in
the prime number theorem for such forms. Moreover, by the Cohen–Lenstra heuristics
(see [Coh93, §5.10]), these class numbers are conjectured to be usually very small and give
therefore presumably not much potential for cancellation effects in average results.
(3) In view of the many similarities between the prime number theory for arithmetic progressions and the prime number theory for binary quadratic forms, it is natural to continue by
investigating primes lying simultaneously in given (families of) arithmetic progressions and
given (families of) binary quadratic forms. Dirichlet stated in 1840 and Meyer proved in
1888 that every positive definite binary quadratic form represents infinitely many primes
lying in a prescribed residue class of a given modulus provided that certain compatibility
conditions are fulfilled; de la Vallée Poussin proved a quantitative version of this statement
in the fifth part of his Recherches analytiques sur la théorie des nombres premiers (see
[Nar00, §2.2] for references).
One can imagine that it should be feasible to combine these old results for fixed progressions
and forms, the original mean-value results for arithmetic progressions and our variants for
binary quadratic forms.
(4) Leaving binary quadratic forms behind, a vast number of much more ambitious extensions
are conceivable. At the turn of the millennium some binary polynomials of higher degree
were proved to represent infinitely many primes. After Friedlander and Iwaniec [FI98] gave
a deep proof for the infinitude of primes of the form $x^2 + y^4$, their ideas were picked up and
advanced by Heath-Brown and Moroz who proved that each primitive irreducible binary
cubic form f (x, y) represents infinitely many primes unless f represents only even numbers;
see [HBM02] and the references there. The family of binary cubic forms would therefore
be a candidate for new “on average”-results.
There exist correspondences between classes of binary cubic forms and classes of cubic
rings, the Delone–Faddeev correspondence and the related Davenport–Heilbronn correspondence (see [BST13], for example); maybe they could be used in a similar way as the
correspondence between classes of binary quadratic forms and ideal classes that we have
used.
Moreover, we recall that one is not necessarily confined to families of forms or families
of polynomials whose members are already known to represent infinitely many primes;
families whose members are merely conjectured to do so are also accessible, as can be seen
from the mean-value results for the family of polynomials $x^2 + n$ with $n \in \mathbb{N}$ that we
mentioned at the end of Section 2.5.1.
(5) Apart from generalizing the forms under consideration, it would also be interesting to
strengthen our existing results for positive definite binary quadratic forms. The most
effective gain would certainly arise from a stronger large sieve inequality for complex class
group characters; see Remark 2.9(b).
Since there exist two essentially different ways to replace the classical Barban–Davenport–
Halberstam theorem by an asymptotic formula (see Section 2.1), one with and one without
appeal to the large sieve inequality, an analogous asymptotic formula for binary quadratic
forms poses an appealing follow-up challenge.
Ultimately, it would, of course, be most fascinating to see anything proved towards results
of Fouvry–Iwaniec/Bombieri–Friedlander–Iwaniec type (see Theorem 2.4) for primes of the
shape $x^2 + ny^2$. Given the depth of these results in their original form, this probably
is but a daydream for now.
Chapter 3
Chebyshev’s bias and prime number
races for binary quadratic forms
At the beginning of his famous 1837 memoir in which Dirichlet proved the existence of infinitely many primes in each arithmetic progression a (mod q) with (a, q) = 1, he mentioned the
observation that the number of primes p with p ≡ a (mod q) asymptotically equals the number
of primes p with p ≡ b (mod q) if (a, q) = 1 = (b, q).¹ However, if all things were equal in all
arithmetic progressions, nothing would be prized, to paraphrase Thomas Hobbes; and therefore it was soon noticed that Dirichlet’s observed equality cannot be true in a stronger sense.
Chebyshev wrote in 1853 (see [MS99, §29]):
In seeking the limiting expression of the functions which determine the totality of
prime numbers of the form 4n + 1 and of those of the form 4n + 3, taken below
a very large limit, I have come to recognize that these two functions differ
notably from each other in their second terms, whose value, for the numbers
4n + 3, is greater than that for the numbers 4n + 1; [...].
Chebyshev went on to give more specific assertions about this difference and claimed to have
proofs, but he did not write any of them down. The first of these assertions states that, given
positive numbers X0 and δ, there exists X = X(X0 , δ) > X0 such that
\[ \left| \frac{\pi(X; 4, 1) - \pi(X; 4, 3)}{\sqrt{X}\,(\log X)^{-1}} - 1 \right| < \delta. \]
It was only in 1891 that Phragmén gave the first correct proof of this statement (see [Lan06]
and the references there). Note that this result does not specify the direction of the “bias”
which Chebyshev mentioned earlier. His second claim states that
\[ e^{-3c} - e^{-5c} + e^{-7c} + e^{-11c} - e^{-13c} - e^{-17c} + e^{-19c} + e^{-23c} - \cdots \to \infty \tag{3.1} \]
as c → 0. Landau [Lan18] proved that this statement would imply the non-existence of zeros
of $L(s, \chi_4)$ to the right of the critical line (Hardy and Littlewood proved the converse
implication in the same year).²
Chebyshev’s remark remained a conundrum and it was only in 1959 that Shanks [Sha59]
found a way to give an appropriate formulation of the bias. In fact, compared to the enormous
advances in the study of uniformities in the distribution behaviour of primes lying in various
residue classes of a given modulus during the first half of the 20th century, the study of the
discrepancies in this distribution behaviour received little to no attention until the 1960s when
Knapowski and Turán coined the term comparative prime number theory for this field and
systematically investigated many of the problems concerning this subject in two series of papers.
Yet, the field mostly stagnated in the subsequent years and it was only in the 1990s that it
was revived, primarily by the work of Rubinstein and Sarnak [RS94]. Building on their paper,
comparative prime number theory has enjoyed a resurgence in popularity in recent years and
results in this field now trade under the illustrious names Chebyshev’s bias, Prime Number Races
and the Shanks–Rényi race.
¹ As we have mentioned in Section 2.1, it took another 59 years before de la Vallée Poussin actually proved
that π(X; q, a) ∼ π(X; q, b) as X → ∞.
² Landau aptly notes that this implication “increases, for unbelievers, the probability that Chebyshev
was mistaken, and, for believers, the wish to see the proof of [(3.1)] reconstructed from his papers.”
In the next section, we will give a short review of these recent advances in comparative prime
number theory for arithmetic progressions. Then we will move on to primes represented by
binary quadratic forms. Ng has shown that a bias exists between certain ideal classes of a given
fixed imaginary quadratic field; this may be interpreted as a bias between form classes of a given
fixed fundamental discriminant. We will review some of his results in Section 3.2. Finally, the
overall theme of this thesis, the distribution of primes of the shape $x^2 + ny^2$ for various positive
integers n, will be taken up in Section 3.3. We will prove that certain even negative fundamental
discriminants are put at a disadvantage in their capability to represent primes by their principal
form – in a way that is not directly apparent from the corresponding prime number theorem,
which we have seen in Chapter 1, and that contrasts with the uniformity results of Chapter 2.
Our restriction to fundamental discriminants reduces the amount of technical details in our
analysis, but we expect similar results to hold for non-fundamental discriminants as well and
therefore for all pairs (n, m) of positive integers for which −4n and −4m have the same class
number. We close this chapter by giving a list of possible lines of further research.
3.1 Bias in the distribution of primes in arithmetic progressions
It would hardly be possible to give a better modern-day introduction to the field of comparative
prime number theory than the survey articles [GM06] and [FK02]. We will not endeavour to do
so, but will only give an overview of some of the most important recent results and especially
the definitions and assumptions which our own results in Section 3.3 will also heavily rely on.
A natural generalization of Chebyshev’s observation suggests the following setting: Fix an
integer q > 3 and a pair (a1 , a2 ) of distinct positive integers less than q which are both coprime
to q. One considers the properties of the set
P (q; a1 , a2 ) = {X > 2 | π(X; q, a1 ) > π(X; q, a2 )}
and wants to see whether P (q; a1 , a2 ) ∩ [2, X] is “usually” (in a suitable sense) larger than the
set P (q; a2 , a1 ) ∩ [2, X]. That is, one aims to find out if there is a bias in the “race” of the prime
counting functions π(X; q, a1 ) and π(X; q, a2 ).
Thus, Chebyshev’s original remark can be expressed as the observation that
\[ \big| P(4; 3, 1) \cap [2, X] \big| > \big| P(4; 1, 3) \cap [2, X] \big| \]
for “most” X > 2. To describe this asymmetry rigorously, one has to fix the right notion of
density on N. The asymptotic density is not appropriate for problems of this kind since, subject
to the Generalized Riemann Hypothesis, it is known that this density does not even exist for
the sets P (q; a1 , a2 ) (see the references to the works of Kaczorowski and Sarnak in [GM06]).
However, Rubinstein and Sarnak [RS94] proved that the logarithmic density
\[ \delta(q; a_1, a_2) = \lim_{X \to \infty} \frac{1}{\log X} \int_{[2,X] \cap P(q; a_1, a_2)} \frac{dt}{t} \]
always exists and that it is always positive³ – subject to the following two hypotheses:
• the Generalized Riemann Hypothesis for Dirichlet characters modulo q (GRHq );
• a linear independence hypothesis, which they call the Grand Simplicity Hypothesis (GSHq ):
The set of all non-negative imaginary parts of zeros of all Dirichlet L-functions L(s, χ),
with χ running over the primitive Dirichlet characters modulo q, is linearly independent
over Q.
According to [RS94, §1] and [FK02, §3], the consequences of similar linear independence assumptions were already investigated by Wintner and Ingham in the 1930s and 1940s (for the
zeros of the Riemann zeta-function) and by Hooley and Montgomery in the 1970s and 1980s
(for the zeros of Dirichlet L-functions). For some results on prime number races one can dispense with the use of the (full) Grand Simplicity Hypothesis, but it seems to be very difficult to
obtain results independent of the Generalized Riemann Hypothesis. For instance, in [FLK13]
it is shown how certain hypothetical sets of zeros of Dirichlet L-functions lying off the critical
line may cause P (q; a1 , a2 ) to have asymptotic density 0. This would be a stark contrast to the
results of Rubinstein and Sarnak.
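A truncated version of the logarithmic density is easy to tabulate for small X. The following Python sketch is only an illustration and not the method of [RS94]: the function names `primes_up_to` and `truncated_log_density` are hypothetical, and any cutoff used with them is an arbitrary choice. It relies on the fact that the counting functions only jump at primes, so the integral of dt/t over the bias set is a finite sum of logarithms.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (limit >= 2 assumed)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]


def truncated_log_density(limit, q, a1, a2):
    """(1 / log limit) * integral of dt/t over
    {t in [2, limit] : pi(t; q, a1) > pi(t; q, a2)}."""
    lead = 0      # pi(t; q, a1) - pi(t; q, a2) on the current interval
    total = 0.0   # accumulated logarithmic measure of the bias set
    prev = 2      # left end point of the current interval
    for p in primes_up_to(limit):
        # Both counting functions are constant on [prev, p).
        if lead > 0:
            total += math.log(p / prev)
        prev = p
        if p % q == a1 % q:
            lead += 1
        elif p % q == a2 % q:
            lead -= 1
    if lead > 0:
        total += math.log(limit / prev)
    return total / math.log(limit)
```

For the race modulo 4, `truncated_log_density(10**4, 4, 3, 1)` measures how often the progression 4n + 3 strictly leads up to 10^4, and swapping the two residue classes measures the opposite event. Such truncated values are only suggestive; they prove nothing about δ(4; 3, 1).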
Let us come back to their paper: They investigate configurations (q; a1 , a2 ) as above for
which the logarithmic density deviates from the expected value $\frac{1}{2}$, i.e. configurations for which
a bias exists. The superficial reason for the existence of such a bias lies in the difference between
the prime counting function π(X; q, a) and the corresponding Chebyshev function ψ(X; q, a);
the latter function does not exhibit a bias comparable to that of the former. In particular, the absence
of the squares of primes in π(X; q, a) may cause a bias: Under the assumption of GRHq the
contribution of the squares of primes is roughly of the same size as the error term in the explicit
formula for ψ(X; q, a); this contribution, of course, only exists if a is a square modulo q. Indeed,
it is shown in [RS94] that the strict inequalities
\[ 0 < \delta(q; a_1, a_2) < \frac{1}{2} < \delta(q; a_2, a_1) < 1 \tag{3.2} \]
hold if a1 is a square modulo q and a2 is not a square modulo q. In particular, this explains
Chebyshev’s original observation. Moreover, one has $\delta(q; a_1, a_2) = \frac{1}{2} = \delta(q; a_2, a_1)$ if both a1
and a2 are quadratic residues modulo q or if both are quadratic non-residues modulo q.
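The role of the squares of primes can be made visible in a short heuristic computation. The following is only a sketch under the assumptions stated in the text (in particular GRHq); the symbols θ(X; q, a) for the sum of log p over primes p ≤ X with p ≡ a (mod q) and N(q, a) for the number of square roots of a modulo q are introduced here for illustration and are not notation from the text.

```latex
\begin{align*}
\psi(X; q, a)
  &= \theta(X; q, a)
   + \sum_{\substack{p^2 \le X \\ p^2 \equiv a \ (\mathrm{mod}\ q)}} \log p
   + O\bigl(X^{1/3}\bigr), \\
\sum_{\substack{p^2 \le X \\ p^2 \equiv a \ (\mathrm{mod}\ q)}} \log p
  &= \sum_{\substack{b \ (\mathrm{mod}\ q) \\ b^2 \equiv a \ (\mathrm{mod}\ q)}}
     \theta\bigl(\sqrt{X}; q, b\bigr)
   \sim N(q, a)\, \frac{\sqrt{X}}{\varphi(q)},
\qquad
N(q, a) = \bigl|\{\, b \ (\mathrm{mod}\ q) : b^2 \equiv a \ (\mathrm{mod}\ q) \,\}\bigr|.
\end{align*}
```

Hence θ(X; q, a), and after partial summation also π(X; q, a), is shifted downwards by a term of order √X exactly when a is a square modulo q, while ψ(X; q, a) carries no such shift. Since the oscillating terms of the explicit formula are of the same order of magnitude under GRHq, this shift is large enough to tilt the race.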
As indicated above, the starting point of the proof of (3.2) is the so-called explicit formula
for the Chebyshev function ψ(X; q, a), which relates this function to the zeros of the Dirichlet
L-functions L(s, χ) for the Dirichlet characters χ modulo q. This formula simplifies considerably
on the assumption of GRHq, and Rubinstein and Sarnak prove that the vector-valued function
\[ E(X; q; a_1, a_2) = \frac{\log X}{\sqrt{X}} \Big( \varphi(q)\pi(X; q, a_1) - \pi(X),\ \varphi(q)\pi(X; q, a_2) - \pi(X) \Big) \]
has a limiting (logarithmic) distribution on $\mathbb{R}^2$, i.e. there exists a probability measure $\mu_{q;a_1,a_2}$
on $\mathbb{R}^2$ such that
\[ \lim_{X \to \infty} \frac{1}{\log X} \int_2^X g(E(t; q; a_1, a_2)) \, \frac{dt}{t} = \int_{\mathbb{R}^2} g(x) \, d\mu_{q;a_1,a_2}(x) \]
for all bounded continuous functions g on $\mathbb{R}^2$. In order to relate $\mu_{q;a_1,a_2}$ and $\delta(q; a_1, a_2)$ the
assumption GSHq is needed; it allows Rubinstein and Sarnak to show that $\mu_{q;a_1,a_2}$ is absolutely
continuous and so $\delta(q; a_1, a_2) = \mu_{q;a_1,a_2}\big(\{(x_1, x_2) \in \mathbb{R}^2 \mid x_1 > x_2\}\big)$. Their explicit formulas
for $\mu_{q;a_1,a_2}$ and, importantly, the knowledge of the exact position of the zeros of Dirichlet
L-functions up to some height then also make it possible to compute numerically the densities
$\delta(q; a_1, a_2)$ for some values of q, a1 and a2 (in particular, they get δ(4; 3, 1) ≈ 0.996, showing
that the bias in Chebyshev’s original prime number race is indeed very pronounced).
³ They actually consider a more general situation: Let $(a_1, \dots, a_r)$ be an r-tuple of distinct positive integers
less than q which are all coprime to q. Set
\[ P(q; a_1, \dots, a_r) = \{ X > 2 \mid \pi(X; q, a_1) > \cdots > \pi(X; q, a_r) \} \]
and find sets of this type which are usually larger (in the sense of a larger logarithmic density) than other sets
$P(q; a_{\sigma(1)}, \dots, a_{\sigma(r)})$ when σ is a permutation of {1, . . . , r}. We will ignore these interesting generalizations as
they are not relevant for our results in Section 3.3.
Moreover, they show that the limiting distribution of $E(X; q; a_1, a_2)/\sqrt{\log q}$ converges in measure to
the Gaussian $(2\pi e^{x_1^2 + x_2^2})^{-1} \, dx_1 \, dx_2$ as $q \to \infty$; this leads to the asymptotic behaviour
\[ \max_{a_1, a_2} \delta(q; a_1, a_2) \to \frac{1}{2} \]
as q → ∞, i.e. any bias dissolves as q grows.
We only mention one of the many recent papers which build on the work of Rubinstein and
Sarnak as we have chosen some of its results as a template for our own results in Section 3.3:
Fiorilli and Martin [FM13] broke new ground by attaining an asymptotic formula for δ(q; a1 , a2 )
which can be exactly evaluated as a finite expression of arithmetic (rather than analytic) information – without explicit use of the position of zeros on the critical line. Therefore, extensive
numerical computations can be omitted and most of the results of Rubinstein and Sarnak can be
calculated in a more precise and effective way. We will come back to their work in Section 3.3.
3.2 Primes represented by different classes of forms with a fixed discriminant
Similar to the extension or generalization of the large sieve inequality, the Bombieri–Vinogradov
type results and the Barban–Davenport–Halberstam type results to algebraic number fields,
which we have mentioned in Section 2.1, it is natural to extend the notion of Chebyshev’s bias
or prime number races to algebraic number fields. This was indicated by Rubinstein and Sarnak
in [RS94, §5] and then carried out in great detail by Ng in [Ng00]. In particular, he considered
discrepancies in the distribution of unramified prime ideals in distinct conjugacy classes of a
fixed Galois group, i.e. discrepancies in the dominion of the Chebotarev density theorem: Let
K be a fixed number field and let L/K be a normal extension; let G be its Galois group. For
each conjugacy class C of G, we set
\[ \tilde{\pi}_{\mathrm{norm}}(X; L/K, C) = \frac{|G|}{|C|} \cdot \tilde{\pi}(X; L/K, C), \]
where $\tilde{\pi}(X; L/K, C)$ was defined in the statement of the Chebotarev density theorem,
Theorem 1.14. This theorem also yields the asymptotic relation $\tilde{\pi}_{\mathrm{norm}}(X; L/K, C) \sim \frac{X}{\log X}$
as X → ∞. For any two distinct conjugacy classes C1 and C2 of G, Ng defines the bias sets
\[ P(L/K; C_1, C_2) = \{ X > 2 \mid \tilde{\pi}_{\mathrm{norm}}(X; L/K, C_1) > \tilde{\pi}_{\mathrm{norm}}(X; L/K, C_2) \}, \]
then constructs limiting distributions attached to these sets and computes explicit logarithmic
densities for some specific examples of sets $P(L/K; C_1, C_2)$.⁴
⁴ In fact, more general results are obtained for r-tuples $(C_1, \dots, C_r)$ of distinct conjugacy classes of G.
It is of particular interest to us that he analyses in detail the case when $K = \mathbb{Q}(\sqrt{q})$, where
q is a fixed negative (fundamental) discriminant, and L is the Hilbert class field of K; as we
have mentioned in Section 1.3, the ideal class group H(q) and the Galois group Gal(L/K)
are isomorphic. To state his results, some additional definitions are necessary: Let $\tilde{\pi}(X; q)$
and $\tilde{\pi}(X; q, C)$ be the numbers of prime ideals p with $N(p) \le X$ in O(q) and in C ∈ H(q),
respectively. For any two ideal classes C1, C2 ∈ H(q), we define the normalized functions
\[ E_q(X; C_1, C_2) = \frac{\log X}{\sqrt{X}} \Big( h(q)\tilde{\pi}(X; q, C_1) - \tilde{\pi}(X; q),\ h(q)\tilde{\pi}(X; q, C_2) - \tilde{\pi}(X; q) \Big), \]
\[ E_q'(X; C_1, C_2) = \frac{\log X}{\sqrt{X}} \Big( \tilde{\pi}(X; q, C_1) - \tilde{\pi}(X; q, C_2) \Big). \]
From the conditional explicit formula for the Chebyshev function associated to ideal class group
characters (Theorem 1.18), one quickly derives that
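\[ E_q'(X; C_1, C_2) = \frac{|\varrho^{-1}(C_2)| - |\varrho^{-1}(C_1)|}{h(q)} - \frac{1}{h(q)} \sum_{\chi \in \widehat{H}(q) \setminus \{\chi_0^{(q)}\}} \big( \chi(C_1) - \chi(C_2) \big) \sum_{|\gamma_\chi| \le T} \frac{X^{i\gamma_\chi}}{\frac{1}{2} + i\gamma_\chi} + O_q\!\left( \frac{X^{1/2} (\log T)^2}{T} + \frac{1}{\log X} \right), \]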
where $\varrho : H(q) \to H(q)$, $C \mapsto C^2$. The first term on the right-hand side accounts for the bias:
It is 0 if both C1 and C2 are in the image of $\varrho$ (or if neither of them is) and it is
\[ \frac{|\varrho^{-1}(C_2)| - |\varrho^{-1}(C_1)|}{h(q)} = \frac{\kappa(q)}{h(q)} \]
if C2 is in the image of $\varrho$ and C1 is not in the image of $\varrho$; recall that κ(q) denotes the number
of ambiguous classes in H(q) (see (1.2)). Thus, by Remark 1.3, an ideal class whose corresponding form class (via the correspondence of Lemma 1.4) belongs to the principal genus will
be discriminated against in a prime ideal race with any ideal class whose corresponding form
class lies in a different genus. Clearly, the presumable bias is the more pronounced the smaller
the odd part $H(q)/(H(q))^2$ of the class group is.
Ng also proves – under the assumption of the Generalized Riemann Hypothesis and a suitable linear independence hypothesis – that a limiting distribution µ of Eq (X; C1 , C2 ) exists.
Moreover, the probability measure on $\mathbb{R}^2$ whose Fourier transform equals an appropriately normalized form of the Fourier transform of µ converges, as q → −∞, in measure to a Gaussian
which is independent of the classes C1 and C2 . It then follows that biases between different ideal
classes disappear as |q| grows – similarly to the above-mentioned behaviour of prime number
races for residue classes modulo q.
Remark 3.1. Using the relation between prime ideals in imaginary quadratic fields and primes
represented by binary quadratic forms, which we described in Section 1.2, one can translate
these results into statements on prime number races for primes represented by forms of either of
two distinct form classes of the same discriminant. Some caution has to be exercised whenever
the principal class C0 enters the race: Both $\tilde{\pi}(X; q, P(q)) = \tilde{\pi}(X; q, B_q(C_0))$ and $\tilde{\pi}(X; q)$ count
not only prime ideals which correspond (via Lemma 1.4) to primes represented by the
principal form or by any form of discriminant q, respectively; they usually also count prime ideals
which lie over rational primes that remain prime in O(q). Note that this cannot happen
for other form classes; compare Remark 3.7. This disparity distorts the bias results that follow
here for primes represented by forms of distinct form classes but the same discriminant. We
will not particularize the results as doing so would make too large a digression from the main
theme of the thesis. The analysis that would be necessary is, however, very similar to the one
that we will perform in the upcoming Section 3.3.
3.3 Prime number races for forms of the shape $x^2 + ny^2$
After having studied questions on uniformity in the distribution of primes represented by binary
quadratic forms with varying discriminants in Chapter 2, we will now throw light on discrepancies in this distribution. The works that we have discussed in the preceding two sections fix the
modulus of an arithmetic progression or the discriminant of an imaginary quadratic field before
studying the existence of a bias between two different residue classes to this same modulus or
between two different ideal classes in this same field (and therefore, to some extent, between
two different form classes of the same discriminant). The only results that consider varying
moduli or varying discriminants are the ones which show that the maximal bias in the prime
number races – between quadratic residues and non-residues or between ideal classes whose
corresponding form classes lie inside and outside the principal genus, respectively – decreases
as the absolute value of the modulus or the discriminant grows.
We will investigate here what happens when we fix the form class and then compare prime
number races between this class for two distinct discriminants. In fact, we will concentrate on
the principal class as this is the only case in which “fixing the class” seems to be a meaningful
notion.
The analogous problem for primes in arithmetic progressions would be the following: Let q1
and q2 be two distinct positive integers with ϕ(q1 ) = ϕ(q2 ). How is the difference
π(X; q1 , 1) − π(X; q2 , 1)
distributed? It seems that this question has drawn much less direct attention than the problems
discussed in Section 3.1. The only work known to the author which mentions this kind of
problem explicitly is the second paper [KT64] of the “Further developments in the comparative
prime-number theory”-series of Knapowski and Turán (a shorter remark also appears in the
first paper of the series). In the appendix of this paper, they suggest the investigation of the
distribution of the difference
π(X; q1 , a1 ) − π(X; q2 , a2 )
for integers q1 , q2 , a1 and a2 with (a1 , q1 ) = 1 = (a2 , q2 ) and ϕ(q1 ) = ϕ(q2 ). They attribute
this question to G. Lorentz. In some cases such problems can be directly reduced to prime
number races for distinct residue classes of a fixed modulus: If q1 = 3 and q2 = 4 (in particular:
ϕ(q1 ) = ϕ(q2 )) and a1 = 2 and a2 = 1, we have
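\[ \pi(X; 3, 2) - \pi(X; 4, 1) = \big|\{ p \le X \mid p \equiv 2, 5, 8 \text{ or } 11 \ (\mathrm{mod}\ 12) \}\big| - \big|\{ p \le X \mid p \equiv 1, 5 \text{ or } 9 \ (\mathrm{mod}\ 12) \}\big| = 1 + \pi(X; 12, 11) - \pi(X; 12, 1). \]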
Thus, the race between primes of the form 3n + 2 and primes of the form 4n + 1 reduces to the
race between primes of the form 12n + 11 (together with a handicap coming from the prime 2)
and primes of the form 12n + 1. Since 1 is a quadratic residue and 11 is a quadratic non-residue
modulo 12 we expect a bias towards the first arithmetic progression by the results mentioned in
Section 3.1 (the resulting bias will only be slightly changed by the extra summand 1). However,
if q1 = 7, q2 = 9 and a1 = a2 = 1, for example, then
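\[ \pi(X; 7, 1) - \pi(X; 9, 1) = \sum_{a \in \{8, 22, 29, 43, 50\}} \pi(X; 63, a) - \sum_{b \in \{10, 19, 37, 46, 55\}} \pi(X; 63, b). \]
Thus, the union of the residue classes 8, 22, 29, 43 and 50 modulo 63 competes against the union of
the residue classes 10, 19, 37, 46 and 55 (the class 1 modulo 63 is common to both progressions and cancels).
Note that 22, 37, 43 and 46 are quadratic residues while 8, 10, 19, 29, 50 and 55
are quadratic non-residues modulo 63, so that each side contains two residues and three non-residues. The results of Section 3.1 cannot be directly applied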
to this “union-problem” (as it is called in the papers of Knapowski and Turán), but it is
entirely conceivable that the methods of Rubinstein–Sarnak and Fiorilli–Martin could be used
to investigate this problem.
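The reduction of such a race to residue classes of a common modulus can be carried out mechanically. The following Python sketch (the helper names `race_decomposition`, `is_quadratic_residue` and `prime_count` are hypothetical) lists the classes modulo lcm(q1, q2) that survive on either side after the common classes have cancelled. It ignores primes dividing the common modulus, which only contribute a bounded handicap such as the summand 1 in the example with q1 = 3 and q2 = 4.

```python
from math import gcd


def race_decomposition(q1, a1, q2, a2):
    """Return (M, plus, minus) with M = lcm(q1, q2) such that, up to primes
    dividing M, pi(X; q1, a1) - pi(X; q2, a2) equals the sum of pi(X; M, a)
    over a in plus minus the sum of pi(X; M, b) over b in minus."""
    modulus = q1 * q2 // gcd(q1, q2)
    left = {r for r in range(modulus)
            if r % q1 == a1 % q1 and gcd(r, modulus) == 1}
    right = {r for r in range(modulus)
             if r % q2 == a2 % q2 and gcd(r, modulus) == 1}
    # Classes lying in both progressions cancel in the difference.
    return modulus, sorted(left - right), sorted(right - left)


def is_quadratic_residue(a, modulus):
    """True if a is congruent to a square modulo modulus (brute force)."""
    return any((x * x - a) % modulus == 0 for x in range(modulus))


def prime_count(limit, q, a):
    """Naive count of primes p <= limit with p congruent to a modulo q."""
    def is_prime(k):
        return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))
    return sum(1 for p in range(2, limit + 1)
               if p % q == a % q and is_prime(p))
```

Calling `race_decomposition(3, 2, 4, 1)` recovers the race between 11 and 1 modulo 12, and `race_decomposition(7, 1, 9, 1)` lists the competing unions of classes modulo 63; `prime_count` allows a numerical sanity check of the resulting identity for small X.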
Coming back to primes represented by binary quadratic forms, or specifically to primes of
the shape $x^2 + ny^2$ for various positive integers n, the purpose of this section will be to study
the distribution of
\[ \Delta(X; n, m) := \pi_0(X; n) - \pi_0(X; m) \]
for positive squarefree integers n and m with $n, m \not\equiv 3 \pmod 4$, where
\[ \pi_0(X; n) := \pi(X; -4n, C_0) = \sum_{\substack{p \le X \\ \exists x, y \in \mathbb{Z}:\ p = x^2 + ny^2}} 1. \]
That is, we want primes of the shape $x^2 + ny^2$ to compete against primes of the shape $x^2 + my^2$
and we assume h(−4n) = h(−4m) to allow for a reasonably fair race: By the prime number
theorem for these forms, Theorem 1.2 or Theorem 1.12, this assumption yields the asymptotic
relation
π0 (X; n) ∼ π0 (X; m)
as X → ∞. Thus, a priori there is no reason to expect that ∆(X; n, m) has a preference for
either positive or negative values. Nevertheless, we will see that a difference in the number of
odd prime divisors of n and m causes a bias. This phenomenon is visible from the graphs of
∆(X; n, m), which we have plotted – using PARI/GP and gnuplot – for the exemplary pairs
(n, m) = (17, 21), (33, 34), (1201, 1365) and (14, 17) and $X \le 10^9$.
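The two quantities that decide whether a pair (n, m) is admissible and which side is expected to be favoured, namely the class number h(−4n) and the number of odd prime divisors of n, can be checked directly for small n. The sketch below counts reduced primitive forms, which is a standard way to compute class numbers of negative discriminants. It is not taken from the thesis, whose plots were produced with PARI/GP, and the function names are hypothetical.

```python
from math import gcd


def class_number(D):
    """Number of reduced primitive forms a x^2 + b xy + c y^2 of
    discriminant D = b^2 - 4ac < 0 (reduced: |b| <= a <= c, and
    b >= 0 whenever |b| == a or a == c)."""
    assert D < 0 and D % 4 in (0, 1)
    count = 0
    b = D % 2                      # b has the same parity as D
    while 3 * b * b <= -D:         # reduced forms satisfy 3 b^2 <= |D|
        ac = (b * b - D) // 4
        a = max(b, 1)
        while a * a <= ac:
            if ac % a == 0 and gcd(gcd(a, b), ac // a) == 1:
                c = ac // a
                # (a, b, c) and (a, -b, c) are both reduced and distinct
                # unless b == 0, a == b or a == c.
                count += 1 if (b == 0 or a == b or a == c) else 2
            a += 1
        b += 2
    return count


def odd_prime_divisor_count(n):
    """Number of distinct odd primes dividing n."""
    while n % 2 == 0:
        n //= 2
    count, p = 0, 3
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 2
    return count + (1 if n > 1 else 0)


if __name__ == "__main__":
    for n, m in [(17, 21), (33, 34), (1201, 1365), (14, 17)]:
        print(n, m, class_number(-4 * n), class_number(-4 * m),
              odd_prime_divisor_count(n), odd_prime_divisor_count(m))
```

The loop at the end prints, for each of the four pairs, the two class numbers followed by the two counts of odd prime divisors, so that the fairness condition and the expected direction of the bias can be read off side by side.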
[Figure 3.1: plot titled “Prime number race: $x^2 + 17y^2$ vs. $x^2 + 21y^2$”; horizontal axis from 0 to 10 (× 10^8), vertical axis from −600 to 1600.]
Figure 3.1: Graph of $X \mapsto \Delta(X; 17, 21) = \pi_0(X; 17) - \pi_0(X; 21)$ for $X \le X_0 = 10^9$. The discriminants
of $x^2 + 17y^2$ and $x^2 + 21y^2$ both have class number 4. We have an apparent bias for 17, the number
with fewer (odd) prime factors: The ratio of the truncated logarithmic densities $\frac{1}{\log X_0} \int_{X \in S_i} \frac{dt}{t}$ of the
sets $S_1 = \{X \le X_0 \mid \Delta(X; 17, 21) > 0\}$ and $S_2 = \{X \le X_0 \mid \Delta(X; 17, 21) < 0\}$ is about 6.6. The ratio
of the corresponding truncated natural densities $\frac{|\{X \in S_i \cap \mathbb{Z}\}|}{X_0}$ is about 6.4.
[Figure 3.2: plot titled “Prime number race: $x^2 + 33y^2$ vs. $x^2 + 34y^2$”; horizontal axis from 0 to 10 (× 10^8), vertical axis from −1600 to 600.]
Figure 3.2: Graph of $X \mapsto \Delta(X; 33, 34) = \pi_0(X; 33) - \pi_0(X; 34)$ for $X \le X_0 = 10^9$. The discriminants
of $x^2 + 33y^2$ and $x^2 + 34y^2$ both have class number 4. We have an apparent bias for 34, the number with
fewer odd prime factors: The ratio of the truncated logarithmic densities of $\{X \le X_0 \mid \Delta(X; 33, 34) > 0\}$
and $\{X \le X_0 \mid \Delta(X; 33, 34) < 0\}$ is about $\frac{1}{4.2}$. The ratio of the corresponding truncated natural densities
is about $\frac{1}{5.7}$.
[Figure 3.3: plot titled “Prime number race: $x^2 + 1201y^2$ vs. $x^2 + 1365y^2$”; horizontal axis from 0 to 10 (× 10^8), vertical axis from −600 to 1000.]
Figure 3.3: Graph of $X \mapsto \Delta(X; 1201, 1365) = \pi_0(X; 1201) - \pi_0(X; 1365)$ for $X \le X_0 = 10^9$. The
discriminants of $x^2 + 1201y^2$ and $x^2 + 1365y^2$ both have class number 16. We have a (visually slightly
less) apparent bias for 1201, the number with fewer (odd) prime factors: The ratio of the truncated
logarithmic densities of $\{X \le X_0 \mid \Delta(X; 1201, 1365) > 0\}$ and $\{X \le X_0 \mid \Delta(X; 1201, 1365) < 0\}$ is
about 9.3. The ratio of the corresponding truncated natural densities is about 2.0.
[Figure 3.4: plot titled “Prime number race: $x^2 + 14y^2$ vs. $x^2 + 17y^2$”; horizontal axis from 0 to 10 (× 10^8), vertical axis from −1000 to 1500.]
Figure 3.4: Graph of $X \mapsto \Delta(X; 14, 17) = \pi_0(X; 14) - \pi_0(X; 17)$ for $X \le X_0 = 10^9$. The discriminants of
$x^2 + 14y^2$ and $x^2 + 17y^2$ both have class number 4. There is no apparent bias. The ratio of the truncated
logarithmic densities of $\{X \le X_0 \mid \Delta(X; 14, 17) > 0\}$ and $\{X \le X_0 \mid \Delta(X; 14, 17) < 0\}$ is about $\frac{1}{1.3}$.
The ratio of the corresponding truncated natural densities is about 1.06.
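For small X the quantities shown in the figures can be tabulated by brute force. The following Python sketch is an independent illustration and not the PARI/GP code behind the plots. The names `primes_represented`, `delta` and `truncated_log_measures` are hypothetical, and the normalisation of the truncated logarithmic measure (integration of dt/t over [2, limit], divided by log limit) is an assumption modelled on the caption of Figure 3.1.

```python
import math


def primes_represented(limit, n):
    """Sorted list of primes p <= limit of the shape x^2 + n*y^2."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    found = set()
    y = 0
    while n * y * y <= limit:
        x = 0
        while x * x + n * y * y <= limit:
            value = x * x + n * y * y
            if sieve[value]:
                found.add(value)
            x += 1
        y += 1
    return sorted(found)


def delta(limit, n, m):
    """Delta(limit; n, m) = pi_0(limit; n) - pi_0(limit; m)."""
    return len(primes_represented(limit, n)) - len(primes_represented(limit, m))


def truncated_log_measures(limit, n, m):
    """Logarithmic measures of {t <= limit : Delta > 0} and
    {t <= limit : Delta < 0}, each divided by log(limit)."""
    events = sorted([(p, 1) for p in primes_represented(limit, n)]
                    + [(p, -1) for p in primes_represented(limit, m)])
    positive = negative = 0.0
    diff, prev = 0, 2
    for p, step in events:
        # Delta is constant on [prev, p).
        if diff > 0:
            positive += math.log(p / prev)
        elif diff < 0:
            negative += math.log(p / prev)
        diff, prev = diff + step, p
    if diff > 0:
        positive += math.log(limit / prev)
    elif diff < 0:
        negative += math.log(limit / prev)
    return positive / math.log(limit), negative / math.log(limit)
```

The double loop visits every pair (x, y) with x^2 + ny^2 ≤ limit, so the running time is roughly proportional to limit/√n plus the cost of the sieve. This is adequate for exploring ranges far below the bound 10^9 used in the figures, but it is not a practical way to reproduce them.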
It does not seem to be possible to reduce this problem to a “union-problem” for a fixed
discriminant like in the analogous race for primes in arithmetic progressions. Nevertheless, we
will follow certain parts of the papers of Rubinstein and Sarnak [RS94] and Fiorilli and Martin
[FM13] and see how their proofs and results can be used in our context.
Besides providing an appealing contrast to the uniformity results of Chapter 2, we hope that
this investigation will also spark further interest in the original question by Knapowski, Turán
and Lorentz.
Remark. Moree and te Riele [MtR04] considered the following race for all integers (not only
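primes) represented by the forms $x^2 + y^2$ or $x^2 + 3y^2$: Let
\[ B_n(X) = \{ k \le X \mid k = x^2 + ny^2 \text{ for some } x, y \in \mathbb{Z} \} \]
for n = 1 and n = 3. The constant in (1.13) can be made explicit: As X → ∞, we have
\[ B_n(X) \sim b(-4n) \frac{X}{\sqrt{\log X}} \]
with
\[ b(-4) = \frac{1}{\sqrt{2 \prod_{p \equiv 3 \ (\mathrm{mod}\ 4)} \left(1 - p^{-2}\right)}} \approx 0.764 \]
and
\[ b(-12) = \frac{1}{\sqrt{2\sqrt{3} \prod_{p \equiv 2 \ (\mathrm{mod}\ 3)} \left(1 - p^{-2}\right)}} \approx 0.639. \]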
Thus, this is not a fair race to begin with and it is clear that B3 (X) will gather an increasing
deficit in the long run. However, Moree and te Riele prove that B1 (X) actually never relinquishes
the leadership after taking the pole position with $2 = 1^2 + 1^2$. Their proof relies, of course,
on numerical computations, but involves an interesting underlying analytic toolbox, which they
developed in an earlier paper on Chebyshev’s bias for composite numbers with restricted prime
divisors.
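The race described in this remark can be observed numerically for small X. The following Python sketch is only an illustration and has nothing to do with the proof of Moree and te Riele. The function name `represented_counts` is hypothetical, and the sketch counts positive integers only, which is an assumption about how the value k = 0 is treated.

```python
def represented_counts(limit, n):
    """counts[k] = number of integers j with 1 <= j <= k that can be
    written as x^2 + n*y^2 with integers x, y."""
    hit = bytearray(limit + 1)
    y = 0
    while n * y * y <= limit:
        x = 0
        while x * x + n * y * y <= limit:
            hit[x * x + n * y * y] = 1
            x += 1
        y += 1
    counts = [0] * (limit + 1)
    running = 0
    for k in range(1, limit + 1):
        running += hit[k]
        counts[k] = running
    return counts


if __name__ == "__main__":
    limit = 10**4
    b1 = represented_counts(limit, 1)
    b3 = represented_counts(limit, 3)
    # Positions at which the form x^2 + 3y^2 is strictly ahead (if any).
    print([k for k in range(1, limit + 1) if b3[k] > b1[k]])
```

In this range the two counts are occasionally tied (for instance at X = 3, 4 and 7), so "never relinquishes the leadership" should be read as "is never overtaken".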
3.3.1 Definitions and statement of the results
For ease of notation and in order to put ourselves in a good position to use the methods in [RS94]
and [FM13] in as straightforward a way as possible, we will consider the function π̆0 (X; n) that
counts the primes of the shape $x^2 + ny^2$ twice: For all X > 2 and all positive squarefree integers
$n \not\equiv 3 \pmod 4$, we set
\[ \breve{\pi}_0(X; n) = 2\,\pi_0(X; n), \qquad \breve{\pi}(X; n) = 2 \sum_{\substack{p \le X \\ \chi_{-4n}(p) = 1}} 1; \]
note that, by Proposition 1.6 and Remark 1.9, the second function counts (twice) all primes
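$p \le X$ which do not divide n and which can be represented by some binary quadratic form
of discriminant −4n. Furthermore, if $m \not\equiv 3 \pmod 4$ is also a positive squarefree integer with
h(−4n) = h(−4m), we define the bias set
\[ P(n, m) = \{ X > 2 \mid \breve{\pi}_0(X; n) > \breve{\pi}_0(X; m) \} = \{ X > 2 \mid \pi_0(X; n) > \pi_0(X; m) \}, \]
the corresponding logarithmic density
\[ \delta(n, m) = \lim_{X \to \infty} \frac{1}{\log X} \int_{[2,X] \cap P(n,m)} \frac{dt}{t}, \]
and the normalized bias functions
\[ \begin{aligned}
E(X; n) &= \frac{\log X}{\sqrt{X}} \cdot \Big( h(-4n)\breve{\pi}_0(X; n) - \breve{\pi}(X; n) \Big), \\
E(X; n, m) &= \frac{\log X}{\sqrt{X}} \cdot h(-4n) \cdot \Big( \breve{\pi}_0(X; n) - \breve{\pi}_0(X; m) \Big) \\
&= E(X; n) - E(X; m) + \frac{\log X}{\sqrt{X}} \cdot \Big( \breve{\pi}(X; n) - \breve{\pi}(X; m) \Big).
\end{aligned} \tag{3.3} \]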
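The second equality in (3.3) is a one-line rearrangement that uses only the assumption h(−4n) = h(−4m). As a sketch, write h for this common class number (a shorthand introduced here, not notation from the text):

```latex
\begin{align*}
h \bigl( \breve{\pi}_0(X; n) - \breve{\pi}_0(X; m) \bigr)
  &= \bigl( h\,\breve{\pi}_0(X; n) - \breve{\pi}(X; n) \bigr)
   - \bigl( h\,\breve{\pi}_0(X; m) - \breve{\pi}(X; m) \bigr)
   + \bigl( \breve{\pi}(X; n) - \breve{\pi}(X; m) \bigr).
\end{align*}
```

Multiplying both sides by log X/√X turns the first two brackets into E(X; n) and E(X; m) and leaves the difference of the two counting functions over all forms of the respective discriminants as the third term.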
Definition 3.2. For any two positive squarefree integers n and m with $n, m \not\equiv 3 \pmod 4$ and
h(−4n) = h(−4m), we will need the usual assumptions (see Section 3.1) in the following form:
• Special cases of the Generalized Riemann Hypothesis ($\mathrm{GRH}_{n,m}$): We assume that all
non-trivial zeros of the Dedekind zeta-functions of the Hilbert class fields of $\mathbb{Q}(\sqrt{-n})$ and
$\mathbb{Q}(\sqrt{-m})$ have real part equal to $\frac{1}{2}$. (Note that, by Remark 1.17 and the Kronecker Factorization Formula (2.70), this assumption implies that the Dirichlet L-function $L(s, \chi_d)$ has
no non-trivial zeros lying off the critical line if $d$ and $\frac{-4n}{d}$ or $d$ and $\frac{-4m}{d}$ are fundamental
discriminants.)
• A linear independence hypothesis ($\mathrm{LI}_{n,m}$): For both k = m and k = n, let $\widehat{H}'_1(-4k)$ be
any maximal set of complex class group characters of H(−4k) such that $\chi \in \widehat{H}'_1(-4k)$
implies $\overline{\chi} \notin \widehat{H}'_1(-4k)$; moreover, let $\widehat{H}'_2(-4k)$ be the set of real class group characters of
H(−4k). Let $F_\pm$ denote the set of all (positive or negative) fundamental discriminants.
Set
\[ D(n, m) = \Big\{ d \in F_\pm : \frac{-4n}{d} \in F_\pm \text{ and } \frac{-4m}{d} \in F_\pm \Big\}, \]
\[ \widehat{H}'_3(n, m) = \big\{ \chi_{-4n/d},\ \chi_{-4m/d} : d \in D(n, m) \big\} \setminus \big\{ \chi_d : d \in D(n, m) \big\}, \]
\[ \widehat{H}'_4(n, m) = \big\{ \chi_{d,-4n/d} \in \widehat{H}'_2(-4n),\ \chi_{d,-4m/d} \in \widehat{H}'_2(-4m) : d \in D(n, m) \big\}. \]
We assume that the multiset of the non-negative imaginary parts of all non-trivial zeros
of all L-functions associated to the class group characters and Dirichlet characters in
\[ \Big( \widehat{H}'_1(-4n) \cup \widehat{H}'_1(-4m) \cup \widehat{H}'_2(-4n) \cup \widehat{H}'_2(-4m) \cup \widehat{H}'_3(n, m) \Big) \setminus \widehat{H}'_4(n, m) \tag{3.4} \]
is linearly independent over Q; in particular, we assume that all elements of this multiset
have multiplicity one.
Remark 3.3. The exclusion of the set $\widehat{H}'_4(n, m)$ of real class group characters in (3.4) is
necessary as the Kronecker Factorization Formula shows that the linear independence hypothesis
could not hold otherwise. The multiple zeros which arise from $\widehat{H}'_4(n, m)$ will cancel out in the key
equation (3.11), which justifies the exclusion; however, we have to add the Dirichlet characters
lying in $\widehat{H}'_3(n, m)$ since the zeros of the associated L-functions remain after this cancellation.
Furthermore, we remark that Proposition 1.6 implies $L(s, \lambda_\chi) = L(s, \lambda_{\overline{\chi}})$ for all $s \in \mathbb{C}$
and all $\chi \in \widehat{H}(-4n) \cup \widehat{H}(-4m)$. This makes it necessary to consider the sets $\widehat{H}'_1(-4n)$ and
$\widehat{H}'_1(-4m)$, which contain a representative character for each pair of conjugate complex characters, instead of the respective full sets of complex class group characters, in order to have a
fighting chance of linear independence. Note that the results below do not depend on the choice
of the representatives as we only use information about the corresponding L-functions.
We also note that the assumption $\mathrm{LI}_{n,m}$ implies that none of the L-functions that are
associated to the class group and Dirichlet characters in (3.4) has a zero at $s = \frac{1}{2}$. Blomer
[Blo04b] has proved an unconditional upper bound on the proportion of class group L-functions
that can vanish there and Ng [Ng00, §5.1 and §6.2] has investigated the effect of central zeros
on the generalized races that we mentioned in Section 3.2.
We will first prove that $E(X; n, m)$ has a limiting distribution. In fact, we will show that
this distribution is related to the distribution of the following type of random variables: For each
real number $\gamma$, let $Z_\gamma$ denote a random variable that is uniformly distributed on the unit circle
such that the set $\{Z_\gamma\}_{\gamma > 0}$ is independent and $Z_{-\gamma} = \overline{Z_\gamma}$; let $Y_\gamma$ denote the random variable
that is given by the real part of $Z_\gamma$.
Theorem 3.4. Let $n$ and $m$ be two distinct positive squarefree integers with $n, m \not\equiv 3 \pmod 4$
and $h(-4n) = h(-4m)$. Assume that $\mathrm{GRH}_{n,m}$ holds. For $k \in \{n, m\}$, let $\omega'(k)$ denote the
number of odd prime divisors of $k$. For all characters $\chi$ in
\[
\widehat{H}(n, m) := \widehat{H}(-4n) \cup \widehat{H}(-4m) \cup \widehat{H}'_3(n, m) \setminus \widehat{H}'_4(n, m)
\]
set
\[
m(\chi) =
\begin{cases}
2 & \text{if $\chi$ is a complex character in $\widehat{H}(-4n)$ or $\widehat{H}(-4m)$,} \\
1 & \text{otherwise.}
\end{cases}
\]
Define
\[
Y(n, m) = 2^{\omega'(m)} - 2^{\omega'(n)} + 2 \sum_{\chi \in \widehat{H}(n,m)} \ \sum_{\substack{\gamma > 0 \\ L(1/2+i\gamma,\chi)=0}} \frac{Y_\gamma}{\sqrt{\frac14 + \gamma^2}}
\]
and
\[
V(n, m) = \sum_{\chi \in \widehat{H}(n,m)} m(\chi) \sum_{\substack{\gamma \in \mathbb{R} \\ L(1/2+i\gamma,\chi)=0}} \frac{1}{\frac14 + \gamma^2}.
\tag{3.5}
\]
Then the bias function $E(\,\cdot\,; n, m)$, as defined in (3.3), has a limiting (logarithmic) distribution
on $\mathbb{R}$, i.e. there exists a probability measure $\mu_{n,m}$ on $\mathbb{R}$ such that
\[
\lim_{X \to \infty} \frac{1}{\log X} \int_2^X g(E(t; n, m)) \, \frac{dt}{t} = \int_{-\infty}^{\infty} g(t) \, d\mu_{n,m}(t)
\]
for all bounded continuous functions $g$ on $\mathbb{R}$. Moreover, if we additionally assume $\mathrm{LI}_{n,m}$, then
$E(\,\cdot\,; n, m)$ has the same limiting distribution as the random variable $Y(n, m)$ and its variance
is $V(n, m)$.
This corresponds to Theorem 1.1 in [RS94] combined with Proposition 2.6 and Proposition 2.7
in [FM13].
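The relation between the random variable Y(n, m) and the variance formula (3.5) can be checked numerically on a toy model. The following is only a sketch: the zero ordinates are hypothetical values chosen for illustration (they are not zeros of any actual L-function), all characters are treated as having weight m(χ) = 1, and the names `sample_Y`, `variance_formula` and `TOY_ORDINATES` are not part of the thesis.

```python
import math
import random

# Hypothetical positive ordinates, chosen only for illustration.
TOY_ORDINATES = [2.0, 3.5, 5.0, 7.25]

def sample_Y(shift, ordinates, rng):
    """One draw of shift + 2 * sum_{gamma > 0} Y_gamma / sqrt(1/4 + gamma^2)."""
    total = shift
    for gamma in ordinates:
        theta = rng.uniform(0.0, 2.0 * math.pi)  # Z_gamma = exp(i * theta)
        y_gamma = math.cos(theta)                # Y_gamma = Re(Z_gamma)
        total += 2.0 * y_gamma / math.sqrt(0.25 + gamma * gamma)
    return total

def variance_formula(ordinates):
    """Sum of 1 / (1/4 + gamma^2) over the zeros +gamma and -gamma."""
    return sum(2.0 / (0.25 + gamma * gamma) for gamma in ordinates)

def empirical_mean_and_variance(shift, ordinates, samples=40000, seed=0):
    """Monte Carlo estimate of the mean and variance of the toy variable."""
    rng = random.Random(seed)
    draws = [sample_Y(shift, ordinates, rng) for _ in range(samples)]
    mean = sum(draws) / samples
    var = sum((d - mean) ** 2 for d in draws) / (samples - 1)
    return mean, var

if __name__ == "__main__":
    # shift = 2^{omega'(m)} - 2^{omega'(n)}, e.g. 4 - 2 = 2
    print(empirical_mean_and_variance(2.0, TOY_ORDINATES))
    print(variance_formula(TOY_ORDINATES))
```

In this toy model the sample mean should be close to the shift and the sample variance close to the value of `variance_formula`, which mirrors the roles of the constant term and of V(n, m) in the theorem.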
From Theorem 3.4 we can then derive an explanation for the apparent bias phenomenon in
our prime number races as well as an asymptotic expression for the logarithmic density δ(n, m):
Theorem 3.5. Let $n$ and $m$ be as in Theorem 3.4. Assume that the hypotheses $\mathrm{GRH}_{n,m}$ and
$\mathrm{LI}_{n,m}$ hold. Then the logarithmic density $\delta(n, m)$ exists. It is greater than $\tfrac12$ if and only if
$\omega'(n) < \omega'(m)$. Moreover,
\[
\delta(n, m) = \frac12 + \frac{2^{\omega'(m)} - 2^{\omega'(n)}}{\sqrt{2\pi V(n, m)}} + O\!\left( \frac{\big(2^{\omega'(m)} - 2^{\omega'(n)}\big)^3}{V(n, m)^{3/2}} \right)
\tag{3.6}
\]
and the asymptotic behaviour of the variance in the denominators is given by
\[
V(n, m) \sim 4 h(-4n) \log(nm)
\tag{3.7}
\]
as $n, m \to \infty$. Thus, the “bias term” in (3.6) is $O_\varepsilon((nm)^{-1/8+\varepsilon})$ and the error term is
$O_\varepsilon((nm)^{-3/8+\varepsilon})$ for all $\varepsilon > 0$.
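To see how the main term of (3.6) depends on the numbers of odd prime divisors, one can evaluate it for a given value of the variance. This is a minimal sketch: the variance is passed in as a hypothetical input (the thesis does not compute explicit values of V(n, m)), and the function names are illustrative.

```python
import math

def omega_odd(k):
    """Number of distinct odd prime divisors of a positive integer k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    while k % 2 == 0:      # discard the prime 2
        k //= 2
    count = 0
    p = 3
    while p * p <= k:
        if k % p == 0:
            count += 1
            while k % p == 0:
                k //= p
        p += 2
    if k > 1:              # one remaining odd prime factor
        count += 1
    return count

def density_main_term(n, m, variance):
    """1/2 + (2^{omega'(m)} - 2^{omega'(n)}) / sqrt(2 * pi * V), without the O-term."""
    shift = 2 ** omega_odd(m) - 2 ** omega_odd(n)
    return 0.5 + shift / math.sqrt(2.0 * math.pi * variance)

if __name__ == "__main__":
    # 17 has one odd prime divisor, 21 has two; the variance 50.0 is made up.
    print(density_main_term(17, 21, 50.0))
    print(density_main_term(21, 17, 50.0))
```

The main term exceeds 1/2 exactly when ω'(n) < ω'(m), and swapping n and m reflects it about 1/2, in line with the statement of the theorem.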
These results correspond to Remark 1.3, the ensuing symmetry analysis and one part of
Proposition 3.1 in [RS94] as well as Theorem 1.1 and Proposition 3.6 in [FM13]. They
immediately imply:
Corollary 3.6. For all integers $n$ and $m$ as in Theorem 3.4, with $n$ and $m$ having different
numbers of odd prime divisors, there exists, subject to the assumptions $\mathrm{GRH}_{n,m}$ and $\mathrm{LI}_{n,m}$, a
“bias” in the “prime number race”
\[
\big\{ p \le X \mid \exists\, x, y \in \mathbb{Z} : p = x^2 + ny^2 \big\}
\quad \text{versus} \quad
\big\{ p \le X \mid \exists\, x, y \in \mathbb{Z} : p = x^2 + my^2 \big\}
\]
towards the contestant corresponding to the integer $n$ or $m$ with the smaller number of odd
prime divisors. The “bias” dissolves as $n$ and $m$ grow.
This result provides an explanation of the biases that show in Figures 3.1–3.3 as well as the
lack of an apparent bias in Figure 3.4. However, a more explicit result would be necessary to
quantify the biases by means of explicit values for the respective logarithmic densities.
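A race of the kind described in Corollary 3.6 can be observed directly for small X by enumerating the values of the two forms. The sketch below is an illustration only; the helper names are hypothetical and the brute-force enumeration is not the method used to produce the figures in this chapter.

```python
import math

def prime_sieve(limit):
    """Bytearray flags with flags[k] == 1 exactly when k is prime (k <= limit)."""
    flags = bytearray([1]) * (limit + 1)
    flags[0] = 0
    if limit >= 1:
        flags[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return flags

def primes_of_shape(n, limit):
    """Sorted primes p <= limit with p = x^2 + n*y^2 for some integers x, y."""
    flags = prime_sieve(limit)
    found = set()
    y = 0
    while n * y * y <= limit:
        x = 0
        while x * x + n * y * y <= limit:
            value = x * x + n * y * y
            if flags[value]:
                found.add(value)
            x += 1
        y += 1
    return sorted(found)

def race_lead(n, m, limit):
    """Number of primes <= limit of shape x^2+ny^2 minus those of shape x^2+my^2."""
    return len(primes_of_shape(n, limit)) - len(primes_of_shape(m, limit))

if __name__ == "__main__":
    print(primes_of_shape(1, 100))
    print(race_lead(17, 21, 10 ** 5))
```

Evaluating `race_lead` along a range of limits gives a rough picture of which contestant leads; the logarithmic density in Theorem 3.5 measures how often this lead is positive.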
3.3.2 Proofs
We will closely follow the corresponding proofs in [RS94] and [FM13]. In fact, we will soon
see that, after some initial preparation and adaptation, the proofs in these papers may be used
almost verbatim. Thus, we will mostly only point out the differences that arise.
Let $X \ge 2$ and fix two positive squarefree integers $n$ and $m$ with $n, m \not\equiv 3 \pmod 4$ and
h(−4n) = h(−4m), i.e. such that −4n and −4m are two negative fundamental discriminants
with the same class number. As in Chapter 2, we will again capitalize on the relationship
between ideal classes and form classes. For this purpose we define
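\[
\tilde{\pi}_0(X; n) = \sum_{\substack{\mathfrak{p} \in P(-4n) \\ N(\mathfrak{p}) \le X}} 1,
\qquad
\tilde{\pi}(X; n) = \sum_{\substack{\mathfrak{p} \in Z(-4n) \\ N(\mathfrak{p}) \le X}} 1.
\]
Note that we have
\[
\begin{aligned}
\tilde{\pi}_0(X; n) &= \breve{\pi}_0(X; n) + \frac12 \sum_{p \le X^{1/2}} \big(1 - \chi_{-4n}(p)\big) + O(1), \\
\tilde{\pi}(X; n) &= \breve{\pi}(X; n) + \frac12 \sum_{p \le X^{1/2}} \big(1 - \chi_{-4n}(p)\big) + O(1 + \log n).
\end{aligned}
\tag{3.8}
\]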
Remark 3.7. We recall that, by Remark 1.9, the second terms on the right sides of (3.8)
account for the prime ideals which lie over rational primes that remain prime in O(−4n); their
norm is the square of a prime. These ideals are the prime ideals which do not correspond to
primes that can be represented by a binary quadratic form of discriminant −4n. Note that
these prime ideals cannot lie in any other ideal class than the principal one. The remainder
terms in (3.8) arise from prime ideals which lie over ramified primes.
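The second terms in (3.8) can be made concrete by classifying rational primes according to the value of the Kronecker symbol. The sketch below assumes the standard splitting rule (value 1: split, value −1: inert, value 0: ramified) and evaluates the symbol by Euler's criterion; the function names are illustrative.

```python
import math

def chi_minus_4n(n, p):
    """Kronecker symbol (-4n / p) for a prime p."""
    d = -4 * n
    if p == 2 or d % p == 0:
        return 0                      # p divides the discriminant: ramified
    # Euler's criterion for an odd prime p not dividing d
    return 1 if pow(d % p, (p - 1) // 2, p) == 1 else -1

def primes_up_to(bound):
    """Primes <= bound by trial division (adequate for small bounds)."""
    return [p for p in range(2, bound + 1)
            if all(p % q for q in range(2, math.isqrt(p) + 1))]

def splitting_counts(n, bound):
    """Return (sum over p <= bound of (1 - chi(p)) / 2, #inert, #ramified)."""
    half_sum = 0.0
    inert = 0
    ramified = 0
    for p in primes_up_to(bound):
        chi = chi_minus_4n(n, p)
        half_sum += (1 - chi) / 2
        if chi == -1:
            inert += 1
        elif chi == 0:
            ramified += 1
    return half_sum, inert, ramified

if __name__ == "__main__":
    print(splitting_counts(1, 30))
    print(splitting_counts(5, 30))
```

The half-sum equals the number of inert primes plus half the number of ramified primes, so it counts the prime ideals of norm p² up to a bounded contribution from ramified primes, which is absorbed in the remainder terms.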
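The corresponding normalized bias functions are
\[
\begin{aligned}
\tilde{E}(X; n) &= \frac{\log X}{\sqrt{X}} \Big( h(-4n)\, \tilde{\pi}_0(X; n) - \tilde{\pi}(X; n) \Big), \\
\tilde{E}(X; n, m) &= \frac{\log X}{\sqrt{X}} \cdot h(-4n) \cdot \Big( \tilde{\pi}_0(X; n) - \tilde{\pi}_0(X; m) \Big).
\end{aligned}
\]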
For all class group characters χ ∈ H(−4n),
we also set
X
e
ψ(X;
χ) =
e
χ(a)Λ(a),
a∈Z(−4n)
N(a)6X
ψe0 (X; n) =
1
h(−4n)
X
e
ψ(X;
χ),
b
χ∈H(−4n)
e is the function that we have defined in (2.36). Thus, we have
where Λ
ψe0 (X; n) =
X
X
`>1
p∈Z(−4n)
p` ∈P (−4n), N(p)` 6X
X
=
log(N(p))
X
log(N(p))
p∈C
C∈H(−4n)
C 2 =P (−4n) N(p)6X 1/2
p∈P (−4n)
N(p)6X
+
X
log(N(p)) +
X
X
`>3
p∈Z(−4n)
p` ∈P (−4n), N(p)` 6X
log(N(p)).
It is the second term on the right side that will account for the bias, similar to the cases in
Sections 3.1 and 3.2. Recall that $\kappa(-4n)$ denotes the number of ambiguous classes in $H(-4n)$
and since $n \not\equiv 3 \pmod 4$ and $n$ is squarefree, we see from (1.2) that $\kappa(-4n) = 2^{\omega'(n)}$, where
$\omega'(n)$ is the number of odd prime divisors of $n$. Thus, for all $\varepsilon > 0$,
\[
\tilde{\psi}_0(X; n) = \sum_{\substack{\mathfrak{p} \in P(-4n) \\ N(\mathfrak{p}) \le X}} \log(N(\mathfrak{p})) + 2^{\omega'(n)} \frac{X^{1/2}}{h(-4n)} + O_\varepsilon\big( X^{1/3} (\log X)\, n^\varepsilon \big)
\tag{3.9}
\]
by the conditional prime ideal theorem (1.15) and the bound $2^{\omega'(n)} \le \tau(n) \ll_\varepsilon n^\varepsilon$. Using the
conditional explicit formula for $\tilde{\psi}(X; \chi)$, Theorem 1.18, instead of [RS94, (2.1)] and the general
asymptotic formula for the number of zeros of L-functions, Theorem 5.8 in [IK04], instead of
[RS94, (2.4)], we deduce (as in [RS94, Lemma 2.1])
\[
\tilde{E}(X; n) = -2^{\omega'(n)} + 1 + \sum_{\substack{\chi \in \widehat{H}(-4n) \\ \chi \ne \chi_0^{(-4n)}}} \frac{\tilde{\psi}(X; \chi)}{\sqrt{X}} + O_\varepsilon\!\left( \frac{n^\varepsilon}{\log X} \right)
\]
by partial summation from (3.9). Switching back from prime ideals to prime numbers, this
equation and the equations in (3.8) imply that
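\[
E(X; n) = -2^{\omega'(n)} + 1 + \sum_{\substack{\chi \in \widehat{H}(-4n) \\ \chi \ne \chi_0^{(-4n)}}} \frac{\tilde{\psi}(X; \chi)}{\sqrt{X}}
 - \frac{(\log X)\big(h(-4n) - 1\big)}{2\sqrt{X}} \sum_{p \le X^{1/2}} \big(1 - \chi_{-4n}(p)\big) + O_\varepsilon\!\left( \frac{n^\varepsilon}{\log X} \right).
\]
Note that
\[
\frac{(\log X)\big(h(-4n) - 1\big)}{\sqrt{X}} \sum_{p \le X^{1/2}} \chi_{-4n}(p) \ll \frac{n^{1/2} (\log n)^2 (\log X)^2}{X^{1/4}}
\]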
under the Generalized Riemann Hypothesis for Dirichlet characters (see [MV07, Theorem 13.7],
for example). Moreover, we have
\[
\breve{\pi}(X; n) - \breve{\pi}(X; m) = \sum_{p \le X} \big( \chi_{-4n}(p) - \chi_{-4m}(p) \big) + O(\log nm).
\]
So
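\[
\begin{aligned}
E(X; n, m) = {} & 2^{\omega'(m)} - 2^{\omega'(n)}
 + \sum_{\substack{\chi \in \widehat{H}(-4n) \\ \chi \ne \chi_0^{(-4n)}}} \frac{\tilde{\psi}(X; \chi)}{\sqrt{X}}
 - \sum_{\substack{\chi \in \widehat{H}(-4m) \\ \chi \ne \chi_0^{(-4m)}}} \frac{\tilde{\psi}(X; \chi)}{\sqrt{X}} \\
& + \frac{\log X}{\sqrt{X}} \sum_{p \le X} \big( \chi_{-4n}(p) - \chi_{-4m}(p) \big)
 + O_\varepsilon\!\left( \frac{(n + m)^{1/2+\varepsilon}}{\log X} \right).
\end{aligned}
\tag{3.10}
\]
Using once again the explicit formula for $\tilde{\psi}(X; \chi)$ for class group characters $\chi$ as well as the
explicit formula for $\psi(X; \chi)$ for $\chi = \chi_{-4n}$ and $\chi = \chi_{-4m}$ (compare [RS94, (2.12)]), we arrive at
\[
\begin{aligned}
E(X; n, m) = {} & 2^{\omega'(m)} - 2^{\omega'(n)} \\
& - \sum_{\substack{\chi \in \widehat{H}(-4n) \\ \chi \ne \chi_0^{(-4n)}}} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \lambda_\chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma}
 + \sum_{\substack{\chi \in \widehat{H}(-4m) \\ \chi \ne \chi_0^{(-4m)}}} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \lambda_\chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma} \\
& - \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \chi_{-4n}) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma}
 + \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \chi_{-4m}) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma} \\
& + O_\varepsilon\!\left( (n + m)^{1+\varepsilon} \left( \frac{X^{1/2} (\log T)^2}{T} + \frac{1}{\log X} \right) \right)
\end{aligned}
\tag{3.11}
\]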
for all T > 1 and all ε > 0. But this is of the same form as “E(X; q, a)” in [RS94, (2.5), (2.6)];
to be precise: It is of the same form that the corresponding difference E(X; q, a) − E(X; q, b)
of Rubinstein–Sarnak would take in [RS94, (2.5), (2.6)]. Note that the factor “χ(a)” which
appears in their formula is hidden in the sums of the second line of the right side of (3.11): The
corresponding factors would be χ(P (−4n)) and χ(P (−4m)), respectively, but these are equal
to 1 for all class group characters χ as P (−4n) and P (−4m) are the trivial elements of the
respective ideal class groups.
Now, this means that the remaining part of the proof of Theorem 1.1 in [RS94, §2.1] can
be applied verbatim here. This finishes the proof of the first assertion of Theorem 3.4, i.e. the
proof of the existence of a limiting distribution of E(X; n, m).
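The oscillating sums over zeros in (3.11) are real-valued whenever the zeros occur in pairs ±γ, because the two terms of each pair are complex conjugates. The following sketch evaluates such a truncated sum for hypothetical ordinates (illustrative values, not zeros of an actual L-function); the function names are assumptions made for this example.

```python
import cmath
import math

def truncated_zero_sum(x, positive_ordinates, height):
    """-sum over |gamma| < height of x^{i gamma} / (1/2 + i gamma).

    The zeros are assumed to come in pairs +gamma and -gamma.
    """
    log_x = math.log(x)
    total = 0j
    for gamma in positive_ordinates:
        if gamma < height:
            for g in (gamma, -gamma):
                total -= cmath.exp(1j * g * log_x) / complex(0.5, g)
    return total

def truncated_zero_sum_real(x, positive_ordinates, height):
    """The same quantity written as -2 * Re(sum over 0 < gamma < height)."""
    log_x = math.log(x)
    total = 0.0
    for gamma in positive_ordinates:
        if gamma < height:
            numerator = 0.5 * math.cos(gamma * log_x) + gamma * math.sin(gamma * log_x)
            total -= 2.0 * numerator / (0.25 + gamma * gamma)
    return total

if __name__ == "__main__":
    toy_ordinates = [2.0, 3.5, 5.0, 7.25]   # hypothetical values
    for x in (10.0, 100.0, 1000.0):
        print(x, truncated_zero_sum_real(x, toy_ordinates, 6.0))
```

As a function of log X each term is a sinusoid of frequency γ, which is the reason why the linear independence of the ordinates governs the limiting distribution.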
Next, we turn to the paper of Fiorilli and Martin [FM13]. The second assertion of our
Theorem 3.4 corresponds to their Proposition 2.6. Their proof uses:
(i) Their Proposition 2.3, which establishes the relation between the sums over zeros and the
random variables Yγ : The assumption that the random variables Zγ (and therefore also the
random variables Yγ ) are independent requires in [FM13] the assumption of the Grand
Simplicity Hypothesis GSHq , which we have introduced in Section 3.1. Therefore we
require the assumption of an appropriate analogue of this linear independence hypothesis
here. In fact, our hypothesis LIn,m is an appropriate analogue: Set
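\[
\begin{aligned}
\widehat{H}'_5(-4n) &= \widehat{H}(-4n) \setminus \widehat{H}'_4(n, m), \\
\widehat{H}'_6(-4n) &= \Big\{ \chi_{-4n/d} : d \in D(n, m) \text{ and } \tfrac{-4n}{d} \notin D(n, m) \Big\},
\end{aligned}
\]
where $D(n, m)$ and $\widehat{H}'_4(n, m)$ were defined in Definition 3.2. The second and third line of
the right side of (3.11) can then be rewritten as
\[
\begin{aligned}
& - \sum_{\chi \in \widehat{H}'_5(-4n)} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \lambda_\chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma}
 + \sum_{\chi \in \widehat{H}'_5(-4m)} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \lambda_\chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma} \\
& - \sum_{\chi \in \widehat{H}'_6(-4n)} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma}
 + \sum_{\chi \in \widehat{H}'_6(-4m)} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma}
\end{aligned}
\tag{3.12}
\]
because the zeros of the L-functions $L(s, \chi_d)$ for $d \in D(n, m) \setminus \{1\}$ cancel out; see
Remark 3.3. There we have also noted that $L(s, \lambda_\chi) = L(s, \lambda_{\overline{\chi}})$. Thus we may rewrite the
second and third line of the right side of (3.11) as
\[
\begin{aligned}
& - 2 \sum_{\chi \in \widehat{H}'_1(-4n)} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma}
 + 2 \sum_{\chi \in \widehat{H}'_1(-4m)} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma} \\
& - \sum_{\chi \in \widehat{H}'_7(-4n)} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma}
 + \sum_{\chi \in \widehat{H}'_7(-4m)} \ \sum_{\substack{|\gamma| < T \\ L(1/2+i\gamma, \chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma},
\end{aligned}
\]
where
\[
\widehat{H}'_7(-4n) = \widehat{H}'_2(-4n) \cup \widehat{H}'_6(-4n) \setminus \widehat{H}'_4(n, m).
\]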
By our hypothesis LIn,m in Definition 3.2, all remaining zeros are linearly independent
over Q; this assumption is therefore indeed the right analogue of the Grand Simplicity
Hypothesis. Thus, the proof of an appropriate analogue of [FM13, Proposition 2.3] goes
through verbatim.
(ii) The analogue in [RS94] of our equation (3.11): Letting T → ∞ in (3.11) with the second
and third line replaced by the rewritten form (3.12), we get
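\[
E(X; n, m) = 2^{\omega'(m)} - 2^{\omega'(n)} + \sum_{\chi \in \widehat{H}(n,m)} (\delta_{\chi,m} - \delta_{\chi,n}) \sum_{\substack{\gamma \in \mathbb{R} \\ L(1/2+i\gamma, \chi) = 0}} \frac{X^{i\gamma}}{\frac12 + i\gamma}
 + O_{n,m}\!\left( \frac{1}{\log X} \right),
\tag{3.13}
\]
where
\[
\delta_{\chi,n} =
\begin{cases}
1 & \text{if } \chi \in \widehat{H}'_5(-4n) \cup \widehat{H}'_6(-4n) =: \widehat{H}'_8(-4n), \\
0 & \text{otherwise,}
\end{cases}
\]
and $\widehat{H}(n, m) = \widehat{H}'_8(-4n) \cup \widehat{H}'_8(-4m)$. Thus, equation (3.13) has now the same form as the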
second displayed equation after [FM13, (2.3)] and we are therefore in the same situation
to finish the proof of the second assertion of Theorem 3.4.
Finally, the proof of our assertion about the variance of E(X; n, m) is basically the same as the
proof of [FM13, Proposition 2.7], except that we have to factor in the double appearance of the
zeros of the L-functions that are associated to the complex class group characters: Since the
variance of 2Yγ equals four times the variance of Yγ , the extra factor m(χ) appears in (3.5).
The proof of Theorem 3.4 may therefore be considered complete.
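The role of the weight m(χ) can be verified by a short computation, sketched here under the assumption that each $Z_\gamma$ is uniform on the unit circle as in the definition preceding Theorem 3.4. Writing $Z_\gamma = e^{i\theta}$ with $\theta$ uniform on $[0, 2\pi)$:

```latex
\[
\mathbb{E}[Y_\gamma] = \frac{1}{2\pi} \int_0^{2\pi} \cos\theta \, d\theta = 0,
\qquad
\operatorname{Var}(Y_\gamma) = \frac{1}{2\pi} \int_0^{2\pi} \cos^2\theta \, d\theta = \frac12 .
\]
% Real character: the zeros \pm\gamma contribute 2 Y_\gamma / \sqrt{1/4 + \gamma^2}.
\[
\operatorname{Var}\!\left( \frac{2 Y_\gamma}{\sqrt{\frac14 + \gamma^2}} \right)
 = \frac{2}{\frac14 + \gamma^2}
 = \sum_{\pm\gamma} \frac{1}{\frac14 + \gamma^2} .
\]
% Complex pair \chi, \bar\chi with common zeros: the contribution doubles to 4 Y_\gamma / \sqrt{1/4 + \gamma^2}.
\[
\operatorname{Var}\!\left( \frac{4 Y_\gamma}{\sqrt{\frac14 + \gamma^2}} \right)
 = \frac{8}{\frac14 + \gamma^2}
 = \sum_{\chi' \in \{\chi, \overline{\chi}\}} 2 \sum_{\pm\gamma} \frac{1}{\frac14 + \gamma^2} .
\]
```

The last identity is exactly the contribution of the pair $\chi, \overline{\chi}$ to (3.5) with $m(\chi) = m(\overline{\chi}) = 2$.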
Now we come to the proof of Theorem 3.5. Its initial assertion, which explains the existence
or non-existence of a bias in our prime number races, follows along the lines of §3.1 and §3.2
in [RS94]; it is basically a consequence of the existence of the limiting distribution that we
have found in Theorem 3.4, the linear independence hypothesis and the symmetry of the Bessel
function J0 . The double appearance of some zeros does not pose a problem since we know – or
rather stipulated – which zeros appear twice and thus we can adjust the proof in the same way
that we have adjusted the linear independence hypothesis LIn,m and the proof of Theorem 3.4.
In order to prove the remaining statements in Theorem 3.5, we may continue to follow
[FM13, §2.2 – §3.2]. In fact, the proof simplifies in our situation: We do not have to deal with
imprimitive characters and whenever they have an expression involving “$|\chi(a) - \chi(b)|$” this is
replaced by “$|\delta_{\chi,n} - \delta_{\chi,m}| = 1$” here. Basically, it usually suffices to replace “$c(q, b) - c(q, a)$” by
“$2^{\omega'(m)} - 2^{\omega'(n)}$”, “$\chi(a) - \chi(b)$” by “$\delta_{\chi,n} - \delta_{\chi,m}$” and “$\varphi(q)$” by “$2h(-4n) + 2h(-4m) = 4h(-4n)$”
(or, to be precise, by “$4|\widehat{H}'_1(-4n)| + 4|\widehat{H}'_1(-4m)| + |\widehat{H}'_7(-4n)| + |\widehat{H}'_7(-4m)|$”, but this is not
significant for the asymptotic considerations) in their proofs; note that the factor 2 stems from
the weight $m(\chi)$ for complex class group characters in (3.5).
Along the way we encounter only a few points which demand slight intervention:
(i) In their Proposition 2.15, Fiorilli and Martin use a completely explicit bound for $N(T, \chi)$,
the number of zeros of $L(\tfrac12 + it, \chi)$ with $|t| \le T$. We may replace this bound with a
non-explicit estimate like the one in [IK04, Theorem 5.8] since we do not aim to find an
explicit error term in Theorem 3.5.
(ii) A key ingredient of their proof is an integral formula for the logarithmic density, which they
quote from a paper of Feuerverger and Martin (see [FM13, Proposition 2.18]). Feuerverger
and Martin derive this formula from a much more general result of theirs, but the proof of
this special case is also sketched in [RS94] (see equations (4.1) and (4.2) there) and does
not need any essential adaptations to satisfy our needs; similarly to the first assertion
of Theorem 3.5, the formula follows from the existence of the limiting distribution in
Theorem 3.4 and the linear independence hypothesis. A slightly more detailed proof of
the integral formula can be found in [Ng00, §5.2 + §5.3.1].
(iii) In order to compute the asymptotic behaviour of the variance, two ingredients are important: First, a formula which relates the sums
\[
\sum_{\substack{\gamma \in \mathbb{R} \\ L(1/2+i\gamma, \chi) = 0}} \frac{1}{\frac14 + \gamma^2}
\]
to the real parts of the quotients $L'(1, \chi)/L(1, \chi)$ is required. A general formula which also covers class group characters can be found by combining [IK04, Theorem 5.6] and [IK04,
Proposition 5.7]. Second, a bound for these quotients is needed and we find such a bound
in [IK04, Theorem 5.17]. Thus, we split the variance into multiple parts: the contributions
from the sums over characters in $\widehat{H}'_5(-4n)$, in $\widehat{H}'_5(-4m)$, in $\widehat{H}'_6(-4n)$ and in $\widehat{H}'_6(-4m)$.
Then we apply these formulas to all parts separately, we note that
\[
\sum_{\chi \in \widehat{H}(n,m)} m(\chi) \sim 4h(-4n)
\]
as $n, m \to \infty$, and thus we get (3.7) as in [FM13, §3.1 + §3.2].
On a side note, we remark that it is not possible to fix either n or m and let the other grow
because the condition h(−4n) = h(−4m) and the class number bounds (2.11) and (2.32)
prevent this.
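For small n, the condition h(−4n) = h(−4m) and the count of ambiguous classes can be checked by enumerating reduced forms. The sketch below uses the classical reduction conditions −a < b ≤ a ≤ c (with b ≥ 0 when a = c) and counts a class as ambiguous when its reduced form has b = 0, b = a or a = c; the function names are illustrative and not taken from the thesis.

```python
from math import gcd

def omega_odd(k):
    """Number of distinct odd prime divisors of a positive integer k."""
    while k % 2 == 0:
        k //= 2
    count, p = 0, 3
    while p * p <= k:
        if k % p == 0:
            count += 1
            while k % p == 0:
                k //= p
        p += 2
    return count + (1 if k > 1 else 0)

def reduced_forms(disc):
    """Primitive reduced forms (a, b, c) with b^2 - 4ac = disc < 0."""
    if disc >= 0 or disc % 4 not in (0, 1):
        raise ValueError("disc must be a negative discriminant")
    forms = []
    a = 1
    while 3 * a * a <= -disc:               # reduced forms satisfy a <= sqrt(|disc| / 3)
        for b in range(-a + 1, a + 1):      # -a < b <= a
            if (b * b - disc) % (4 * a) != 0:
                continue
            c = (b * b - disc) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, abs(b)), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def class_number(disc):
    """Number of classes of primitive forms of discriminant disc."""
    return len(reduced_forms(disc))

def ambiguous_classes(disc):
    """Number of classes equal to their own inverse."""
    return sum(1 for a, b, c in reduced_forms(disc) if b == 0 or b == a or a == c)

if __name__ == "__main__":
    for n in (1, 2, 5, 6, 10, 13, 14, 17, 21, 30):
        print(n, class_number(-4 * n), ambiguous_classes(-4 * n), 2 ** omega_odd(n))
```

For instance, n = 17 and m = 21 both give class number 4 but have different numbers of odd prime divisors, so they form an admissible pair with a predicted bias.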
Note that the proof is, in fact, capable of giving a direct analogue of the more general formula
[FM13, (1.1)].
Finally, the concluding assertions of Theorem 3.5 are a consequence of (3.7) and the class
number bound (2.32). This finishes the proof of Theorem 3.5 and Corollary 3.6 follows at once.
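The exponents in Theorem 3.5 can be traced as follows. This is a sketch under the assumption that the class number bound (2.32), which is not reproduced in this section, is a lower bound of Siegel type, $h(-4k) \gg_\varepsilon k^{1/2-\varepsilon}$.

```latex
% Since h(-4n) = h(-4m), the assumed lower bound gives
\[
h(-4n) = \big( h(-4n)\, h(-4m) \big)^{1/2} \gg_\varepsilon (nm)^{1/4 - \varepsilon},
\qquad\text{so by (3.7)}\qquad
V(n, m) \gg_\varepsilon (nm)^{1/4 - \varepsilon} .
\]
% Together with 2^{\omega'(k)} \le \tau(k) \ll_\varepsilon k^{\varepsilon} this yields
\[
\frac{2^{\omega'(m)} - 2^{\omega'(n)}}{\sqrt{2\pi V(n, m)}} \ll_\varepsilon (nm)^{-1/8 + \varepsilon},
\qquad
\frac{\big( 2^{\omega'(m)} - 2^{\omega'(n)} \big)^3}{V(n, m)^{3/2}} \ll_\varepsilon (nm)^{-3/8 + \varepsilon} .
\]
```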
3.3.3 Possible extensions
The great deal of recent research in comparative prime number theory for primes in arithmetic
progressions provides multiple ways to extend the results of this section:
(1) Prime number races with more than two competing discriminants could be investigated,
i.e. one could determine the existence (and size) of any deviation of the logarithmic
densities of the sets $\{ X \ge 2 \mid \pi_0(X; n_1) > \cdots > \pi_0(X; n_r) \}$ from the symmetric value $1/r!$
for all $r$-tuples $(n_1, \dots, n_r)$ of distinct positive (squarefree) integers $n_i \not\equiv 3 \pmod 4$ with
$h(-4n_1) = \cdots = h(-4n_r)$.
(2) A more detailed analysis should make it possible to compute explicitly the logarithmic
densities for certain races or to give explicit general bounds for all such races. This would
require either the numerical computation of many zeros of class group L-functions (as
in [RS94] and [Ng00]) or the calculation of completely explicit estimates for the corresponding numbers of zeros on the critical line up to any given height (as in [FM13]) – which
do not seem to exist in the literature, but which could probably be extracted from [LO77]
(compare Remark 1.19).
(3) The low-lying zeros of (real) Dirichlet L-functions have a major effect on the bias in
prime number races between quadratic residues and non-residues; see [FM13, §3.6], for
example. Thus, it would be interesting to see whether the results of Fouvry and Iwaniec
that we sketched in Remark 2.24 can give additional information on the prime number
races between forms of the shape $x^2 + ny^2$.
(4) Finally, it would also be worthwhile to investigate whether the unproved assumptions are necessary. In particular, one could try to find out whether the existence of certain hypothetical
zeros off the critical line would distort the densities of the sets P (n, m) in an unusual way.
This last point then also raises the question whether the two antithetic types of results in this
thesis, those on “uniformity on average” and those on discrepancies in the distribution of prime
numbers represented by binary quadratic forms, could somehow connect with each other. Since
the mean-value results of Bombieri–Vinogradov type and Barban–Davenport–Halberstam type
have often turned out to be apt substitutes for the Generalized Riemann Hypothesis, one would
hope to find a way to introduce them in questions on prime number races. Of course, one
cannot reasonably expect that they may be of any use when one considers biases for a fixed
discriminant or for a fixed residue class as in the first two sections of this chapter – just as little
as for the fixed pairs of discriminants that we have considered in this section. Also, the ranges
in the results of the second chapter would probably first have to be improved to reach the status
of a “Generalized Riemann Hypothesis on average” to be potentially useful for prime number
races of binary quadratic forms. However, it is conceivable that interesting unconditional results
could then be attained for pairs of growing discriminants, for example. But this is another story
and should be investigated at another time.
Bibliography
[Bar66]
Mark B. Barban, The “large sieve” method and its application to number theory, Russian Mathematical
Surveys 21 (1966), no. 1, 49–103. Translated by H. J. Godwin (Russian original appeared in Uspehi
Mat. Nauk 21 (1), 1966).
[Ber12]
Paul Bernays, Über die Darstellung von positiven, ganzen Zahlen durch die primitiven, binären
quadratischen Formen einer nicht-quadratischen Diskriminante, Dissertation (Universität Göttingen),
1912.
[BG06]
Valentin Blomer and Andrew Granville, Estimates for representation numbers of quadratic forms,
Duke Math. J. 135 (2006), no. 2, 261–302.
[BGHZ08] Jan Hendrik Bruinier, Gerard van der Geer, Günter Harder, and Don Zagier, The 1-2-3 of modular
forms, Universitext, Springer-Verlag, Berlin, 2008.
[BHM07]
Valentin Blomer, Gergely Harcos, and Philippe Michel, Bounds for modular L-functions in the level
aspect, Ann. Sci. École Norm. Sup. (4) 40 (2007), no. 5, 697–740.
[Blo04a]
Valentin Blomer, Binary quadratic forms with large discriminants and sums of two squareful numbers,
J. Reine Angew. Math. 569 (2004), 213–234.
[Blo04b]
Valentin Blomer, Non-vanishing of class group L-functions at the central point, Ann. Inst. Fourier
(Grenoble) 54 (2004), no. 4, 831–847.
[Bom65]
Enrico Bombieri, On the large sieve, Mathematika 12 (1965), 201–225.
[Bom87]
Enrico Bombieri, Le grand crible dans la théorie analytique des nombres, Astérisque 18 (1987), 103 pp.
[Br95]
Jörg Brüdern, Einführung in die analytische Zahlentheorie, Springer-Verlag, Berlin, 1995.
[BST13]
Manjul Bhargava, Arul Shankar, and Jacob Tsimerman, On the Davenport–Heilbronn theorems and
second order terms, Invent. Math. 193 (2013), no. 2, 439–499.
[Bue89]
Duncan A. Buell, Binary quadratic forms, Springer-Verlag, New York, 1989.
[BV07]
Johannes Buchmann and Ulrich Vollmer, Binary quadratic forms, Algorithms and Computation in
Mathematics, vol. 20, Springer-Verlag, Berlin, 2007.
[BZ07]
Stephan Baier and Liangyi Zhao, Primes in quadratic progressions on average, Math. Ann. 338 (2007),
no. 4, 963–982.
[Coh93]
Henri Cohen, A course in computational algebraic number theory, Graduate Texts in Mathematics,
vol. 138, Springer-Verlag, Berlin, 1993.
[Cox97]
David A. Cox, Primes of the form x2 + ny 2 : Fermat, class field theory and complex multiplication,
Paperback ed., A Wiley-Interscience Publication, John Wiley & Sons Inc., New York, 1997.
[Dav00]
Harold Davenport, Multiplicative number theory, Third ed., Graduate Texts in Mathematics, vol. 74,
Springer-Verlag, New York, 2000.
[DFI02]
William Duke, John Friedlander, and Henryk Iwaniec, The subconvexity problem for Artin L-functions,
Invent. Math. 149 (2002), no. 3, 489–577.
[Dit13]
Jakob Ditchen, On the average distribution of primes represented by binary quadratic forms, ArXiv
e-print (December 2013), available at http://arxiv.org/abs/1312.1502.
[DK00]
William Duke and Emmanuel Kowalski, A problem of Linnik for elliptic curves and mean-value estimates for automorphic representations, Invent. Math. 139 (2000), no. 1, 1–39.
[dlVP96]
Charles-Jean de la Vallée Poussin, Recherches analytiques sur la théorie des nombres premiers, Ann.
Soc. Sci. Bruxelles 20 (1896), 183–256, 281–362, 363–397.
[dlVP97]
Charles-Jean de la Vallée Poussin, Recherches analytiques sur la théorie des nombres premiers, Ann.
Soc. Sci. Bruxelles 21 (1897), 251–342, 343–368.
[EH71]
Peter D. T. A. Elliott and Heini Halberstam, The least prime in an arithmetic progression, in: Studies
in Pure Mathematics (Presented to Richard Rado), Academic Press, 1971, pp. 59–61.
[FG92]
John Friedlander and Andrew Granville, Limitations to the equi-distribution of primes. III, Compositio
Math. 81 (1992), no. 1, 19–32.
[FI03a]
Étienne Fouvry and Henryk Iwaniec, Low-lying zeros of dihedral L-functions, Duke Math. J. 116
(2003), no. 2, 189–217.
[FI03b]
John Friedlander and Henryk Iwaniec, Exceptional characters and prime numbers in arithmetic progressions, Int. Math. Res. Not. 37 (2003), 2033–2050.
[FI10]
John Friedlander and Henryk Iwaniec, Opera de cribro, American Mathematical Society Colloquium
Publications, vol. 57, American Mathematical Society, Providence, RI, 2010.
[FI98]
John Friedlander and Henryk Iwaniec, The polynomial X 2 + Y 4 captures its primes, Ann. of Math.
(2) 148 (1998), no. 3, 945–1040.
[FK02]
Kevin Ford and Sergei Konyagin, Chebyshev’s conjecture and the prime number race, in: IV International Conference “Modern Problems of Number Theory and its Applications” (Tula, 2001), Mosk.
Gos. Univ. im. Lomonosova, Mekh.-Mat. Fak., Moscow, 2002, pp. 67–91.
[FLK13]
Kevin Ford, Youness Lamzouri, and Sergei Konyagin, The prime number race and zeros of Dirichlet
L-functions off the critical line: Part III, Q. J. Math. 64 (2013), no. 4, 1091–1098.
[FM13]
Daniel Fiorilli and Greg Martin, Inequities in the Shanks-Rényi Prime Number Race: An asymptotic
formula for the densities, J. Reine Angew. Math. 676 (2013), 121–212.
[Fog61]
Ernests Fogels, On the distribution of prime ideals, Acta Arith. 7 (1961/1962), 255–269.
[Fog65]
Ernests Fogels, On the zeros of L-functions, Acta Arith. 11 (1965), 67–96.
[Fog67]
Ernests Fogels, Corrigendum: “On the zeros of L-functions”, Acta Arith. 14 (1967/1968), 435.
[Gal68]
P. X. Gallagher, Bombieri’s mean value theorem, Mathematika 15 (1968), 1–6.
[Gel75]
Stephen S. Gelbart, Automorphic forms on adèle groups, Annals of Mathematics Studies, No. 83,
Princeton University Press, Princeton, N.J., 1975.
[GM06]
Andrew Granville and Greg Martin, Prime number races, Amer. Math. Monthly 113 (2006), no. 1,
1–33.
[Gol06]
Dorian Goldfeld, Automorphic forms and L-functions for the group GL(n, R), Cambridge Studies in
Advanced Mathematics, vol. 99, Cambridge University Press, Cambridge, 2006.
[Gol70]
Larry Joel Goldstein, A generalization of the Siegel-Walfisz theorem, Trans. Amer. Math. Soc. 149
(1970), 417–429.
[GPY09]
Daniel A. Goldston, János Pintz, and Cem Y. Yıldırım, Primes in tuples. I, Ann. of Math. (2) 170
(2009), no. 2, 819–862.
[GSS07]
Catherine Goldstein, Norbert Schappacher, and Joachim Schwermer (eds.), The shaping of arithmetic
after C. F. Gauss’s disquisitiones arithmeticae, Springer-Verlag, Berlin, 2007.
[HB09]
D. R. Heath-Brown, Convexity bounds for L-functions, Acta Arith. 136 (2009), no. 4, 391–395.
[HB95]
D. R. Heath-Brown, A mean value estimate for real character sums, Acta Arith. 72 (1995), no. 3,
235–275.
[HBM02]
D. R. Heath-Brown and B. Z. Moroz, Primes represented by binary cubic forms, Proc. London Math.
Soc. (3) 84 (2002), no. 2, 257–288.
[Hec17]
Erich Hecke, Über die L-Funktionen und den Dirichletschen Primzahlsatz für einen beliebigen
Zahlkörper, Nachrichten der K. Gesellschaft der Wissenschaften zu Göttingen, Mathematischphysikalische Klasse (1917), 299–318.
[Hin81]
Jürgen G. Hinz, On the theorem of Barban and Davenport-Halberstam in algebraic number fields, J.
Number Theory 13 (1981), no. 4, 463–484.
[HM06]
Gergely Harcos and Philippe Michel, The subconvexity problem for Rankin-Selberg L-functions and
equidistribution of Heegner points. II, Invent. Math. 163 (2006), no. 3, 581–655.
[HM13]
Roman Holowinsky and Ritabrata Munshi, Level aspect subconvexity for Rankin-Selberg L-functions,
in: Automorphic Representations and L-functions, Tata Inst. Fund. Res., 2013, pp. 311–334.
[Hoo75]
Christopher Hooley, On the Barban-Davenport-Halberstam theorem. I, J. Reine Angew. Math.
274/275 (1975), 206–223.
[HR11]
Heini Halberstam and Hans-Egon Richert, Sieve methods, Dover Publications, 2011. Republication of
the work originally published in 1974 by Academic Press.
[IK04]
Henryk Iwaniec and Emmanuel Kowalski, Analytic number theory, American Mathematical Society
Colloquium Publications, vol. 53, American Mathematical Society, Providence, RI, 2004.
[Iwa97]
Henryk Iwaniec, Topics in classical automorphic forms, Graduate Studies in Mathematics, vol. 17,
American Mathematical Society, Providence, RI, 1997.
[KM02]
Emmanuel Kowalski and Philippe Michel, Zeros of families of automorphic L-functions close to 1,
Pacific J. Math. 207 (2002), no. 2, 411–431.
[KM97]
Emmanuel Kowalski and Philippe Michel, Sur les zéros des fonctions L automorphes de grand niveau,
ArXiv e-print (1997), available at http://arxiv.org/abs/math/9707238v1.
[Kow04]
Emmanuel Kowalski, Un cours de théorie analytique des nombres, Cours Spécialisés, vol. 13, Société
Mathématique de France, Paris, 2004.
[Kow08]
Emmanuel Kowalski, The large sieve and its applications, Cambridge Tracts in Mathematics, vol. 175,
Cambridge University Press, Cambridge, 2008.
[KT64]
S. Knapowski and P. Turán, Further developments in the comparative prime-number theory. II. A
modification of Chebyshev’s assertion, Acta Arith. 10 (1964), 293–313.
[Kud03]
Stephen S. Kudla, From modular forms to automorphic representations, in: An introduction to the
Langlands program (Jerusalem, 2001), Birkhäuser Boston, 2003, pp. 133–151.
[Lan06]
Edmund Landau, Über einen Satz von Tschebyschef, Math. Ann. 61 (1906), no. 4, 527–550.
[Lan14]
Edmund Landau, Über die Primzahlen in definiten quadratischen Formen und die Zetafunktion reiner
kubischer Körper, in: Schwarz-Festschrift, Berlin, 1914, pp. 244–273.
[Lan18]
Edmund Landau, Über einige ältere Vermutungen und Behauptungen in der Primzahltheorie, Math.
Z. 1 (1918), 1–24.
[Li79]
Wen-Ch’ing Winnie Li, L-series of Rankin type and their functional equations, Math. Ann. 244 (1979),
no. 2, 135–166.
[Lin44]
U. V. Linnik, On the least prime in an arithmetic progression, Rec. Math. [Mat. Sbornik] N.S. 15(57)
(1944), 139–178 and 347–368.
[LO77]
J. C. Lagarias and A. M. Odlyzko, Effective versions of the Chebotarev density theorem, in: Algebraic
number fields: L-functions and Galois properties (Proc. Sympos., Univ. Durham, Durham, 1975),
Academic Press, 1977, pp. 409–464.
[LP92]
H. W. Lenstra Jr. and Carl Pomerance, A rigorous time bound for factoring integers, J. Amer. Math.
Soc. 5 (1992), no. 3, 483–516.
[Mic07]
Philippe Michel, Analytic number theory and families of automorphic L-functions, in: Automorphic
forms and applications, IAS/Park City Math. Ser., vol. 12, Amer. Math. Soc., 2007, pp. 181–295.
[MM87]
M. Ram Murty and V. Kumar Murty, A variant of the Bombieri-Vinogradov theorem, in: Number
theory (Montreal, Que., 1985), CMS Conf. Proc., vol. 7, Amer. Math. Soc., 1987, pp. 243–272.
[MP13]
M. Ram Murty and Kathleen L. Petersen, A Bombieri-Vinogradov theorem for all number fields,
Trans. Amer. Math. Soc. 365 (2013), no. 9, 4987–5032.
[MS99]
Andrey Markov and Nikolay Sonin (eds.), Œuvres de P.L. Tchebychef, Commissionaires de l’Académie
impériale des sciences, Saint Petersburg, 1899.
[MtR04]
Pieter Moree and Herman J. J. te Riele, The hexagonal versus the square lattice, Math. Comp. 73
(2004), no. 245, 451–473.
[MV07]
Hugh L. Montgomery and Robert C. Vaughan, Multiplicative number theory. I. Classical theory,
Cambridge Studies in Advanced Mathematics, vol. 97, Cambridge University Press, Cambridge, 2007.
[MV10]
Philippe Michel and Akshay Venkatesh, The subconvexity problem for GL2 , Publ. Math. Inst. Hautes
Études Sci. 111 (2010), 171–271.
[Nar00]
Władysław Narkiewicz, The development of prime number theory, Springer Monographs in Mathematics, Springer-Verlag, Berlin, 2000.
[Nar04]
Władysław Narkiewicz, Elementary and analytic theory of algebraic numbers, Third ed., Springer
Monographs in Mathematics, Springer-Verlag, Berlin, 2004.
[Neu92]
Jürgen Neukirch, Algebraische Zahlentheorie, Springer-Verlag, Berlin, 1992.
[Ng00]
Nathan Ng, Limiting distributions and zeros of Artin L-functions, Dissertation (University of British
Columbia), 2000.
[RS94]
Michael Rubinstein and Peter Sarnak, Chebyshev’s bias, Experiment. Math. 3 (1994), no. 3, 173–197.
[Rud87]
Walter Rudin, Real and complex analysis, Third ed., McGraw-Hill Book Co., New York, 1987.
[Sch86]
P. D. Schumer, On the large sieve inequality in an algebraic number field, Mathematika 33 (1986),
no. 1, 31–54.
[Ser81]
Jean-Pierre Serre, Quelques applications du théorème de densité de Chebotarev, Inst. Hautes Études
Sci. Publ. Math. 54 (1981), 323–401.
[Sha59]
Daniel Shanks, Quadratic residues and the distribution of primes, Math. Tables Aids Comput. 13
(1959), 272–284.
[Vin65]
Askold I. Vinogradov, On the density hypothesis for Dirichlet L-series (Russian), Izv. Akad. Nauk
SSSR Ser. Mat. 29 (1965), 903–934.
[Vin66]
Askold I. Vinogradov, Correction to the paper of A. I. Vinogradov “On the density hypothesis for
Dirichlet L-series” (Russian), Izv. Akad. Nauk SSSR Ser. Mat. 30 (1966), 719–720.
[Web82]
Heinrich Weber, Beweis des Satzes, dass jede eigentlich primitive quadratische Form unendlich viele
Primzahlen darzustellen fähig ist, Math. Ann. 20 (1882), no. 3, 301–329.
[Wei83]
Alfred Weiss, The least prime ideal, J. Reine Angew. Math. 338 (1983), 56–94.
[Xyl11]
Triantafyllos Xylouris, On the least prime in an arithmetic progression and estimates for the zeros of
Dirichlet L-functions, Acta Arith. 150 (2011), no. 1, 65–91.
[Zag81]
Don Zagier, Zetafunktionen und quadratische Körper, Springer-Verlag, Berlin, 1981.
[Zha13]
Yitang Zhang, Bounded gaps between primes, Preprint (2013). To appear in Annals of Math.
Jakob J. Ditchen
Zürich, Autumn 2013