Uploaded by Hulda Narkevicius

on-the-triple-correlations-of-fractional-parts-of-n2alpha-

advertisement
Canad. J. Math. 2021, pp. 1–38
http://dx.doi.org/10.4153/S0008414X21000249
© Canadian Mathematical Society 2021
On the triple correlations of fractional parts
of n2 α
Niclas Technau and Aled Walker
Abstract. For fixed α ∈ [0, 1], consider the set S α,N of dilated squares α, 4α, 9α, . . . , N 2 α modulo 1.
Rudnick and Sarnak conjectured that, for Lebesgue, almost all such α the gap-distribution of S α,N
is consistent with the Poisson model (in the limit as N tends to infinity). In this paper, we prove
a new estimate for the triple correlations associated with this problem, establishing an asymptotic
expression for the third moment of the number of elements of S α,N in a random interval of length
L/N, provided that L > N 1/4+ε . The threshold of 41 is substantially smaller than the threshold of 21
(which is the threshold that would be given by a naïve discrepancy estimate).
Unlike the theory of pair correlations, rather little is known about triple correlations of the
∞
dilations (αa n mod 1)∞
n=1 for a nonlacunary sequence (a n ) n=1 of increasing integers. This is partially
due to the fact that the second moment of the triple correlation function is difficult to control, and
thus standard techniques involving variance bounds are not applicable. We circumvent this impasse
by using an argument inspired by works of Rudnick, Sarnak, and Zaharescu, and Heath-Brown,
which connects the triple correlation function to some modular counting problems.
In Appendix B, we comment on the relationship between discrepancy and correlation functions,
answering a question of Steinerberger.
1 Introduction
Let (a n )∞
n=1 be a strictly increasing sequence of positive integers. This paper is
concerned with the distribution of (αa n mod 1)∞
n=1 in short intervals, for a generic
dilate α ∈ [0, 1], with particular focus on the case when a n = n 2 .
We begin with the familiar notion of the discrepancy D N = D N ((αa n mod 1)∞
n=1 ),
defined to be
(1.1)
D N = sup ∣
0<a<b<1
∣{n ≤ N ∶ αa n mod 1 ∈ (a, b)}∣
− (b − a)∣ .
N
It is an old result of Weyl, contained in his 1916 paper [29], that D N ((αn 2 mod 1)∞
n=1 ) =
o α (1) as N → ∞ for any irrational α. The sequence (αn 2 mod 1)∞
is
then
said
to
n=1
be equidistributed modulo 1. One way of viewing Weyl’s result is as demonstrating a
Received by the editors July 21, 2020; revised April 14, 2021; accepted April 19, 2021.
Published online on Cambridge Core May 4, 2021.
While the work toward this paper was being carried out, NT was supported by the European
Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 786758) and by the Austrian Science Fund (FWF): project J-4464. A.W.
is supported by a Junior Research Fellowship from Trinity College Cambridge, and was also supported
by a Post-Doctoral Fellowship at the Centre de Recherches Mathématiques.
AMS subject classification: 11K06, 11K60, 11L05.
Keywords: Triple correlations, uniform distribution, discrepancy.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
2
N. Technau and A. Walker
pseudorandomness property for the sequence (αn 2 mod 1)∞
n=1 . Indeed, in the random
N
is replaced by N independent random variables
model in which (αn 2 mod 1)n=1
X 1 , . . . , X N , which are uniformly distributed on [0, 1), one has1
√
2N D N ((X n )∞
n=1 )
√
lim sup
=1
log log N
N→∞
almost surely. In particular ED N ((X n )∞
n=1 ) = o(1) as N → ∞.
One might wonder, considering the strong quantitative decay enjoyed in the
random model, whether Weyl’s result admits such a quantitative refinement. Unfortunately, as is well known, if α is well approximated by rational numbers with small
denominators, then the discrepancy D N ((αn 2 mod 1)∞
n=1 ) can tend to zero extremely
slowly. However, for a generic α, the situation is much improved, and in fact, for any
strictly increasing sequence of positive integers (a n )∞
n=1 , one has the classical result of
Erdős and Koksma [9], which implies that
− 2 +ε
D N ((αa n mod 1)∞
)
n=1 ) = O ε (N
1
for Lebesgue almost all α ∈ [0, 1]. This trivially implies that, for almost all α ∈ [0, 1], if
1
I ⊂ R/Z is a fixed interval of length ∣I∣ ⩾ N − 2 +ε , then
(1.2)
∣{n ⩽ N ∶ αn 2 mod 1 ∈ I}∣ = (1 + o ε (1))N∣I∣.
So, at least for a generic α, pseudorandomness is enjoyed down to the scale N −1/2+ε .
A result of Aistleitner and Larcher [1, Corollary 1] implies that the exponent 21 is
optimal for the metric discrepancy problem: for generic α and for each ε > 0, there
−1/2−ε
are infinitely many N such that D N ((αa n mod 1)∞
, provided a n = P(n)
n=1 ) > N
for some polynomial P, of degree at least two, with integer coefficients.
By allowing the interval I to vary, one can develop a related notion of pseudorandomness at scales that are smaller than N −1/2 , one which concerns the “clustering” of
the points αn 2 mod 1. To introduce this notion, which will be the main focus of the
paper, we let Y be a random variable that is uniformly distributed on [0, 1), and for
a natural number N and a parameter L in the range 0 < L ⩽ N, we let Wα,L,N be the
random variable
(1.3)
Wα,L,N ∶= ∣{n ⩽ N ∶ αn 2 ∈ [Y , Y + L/N] mod 1}∣.
It is easy to see that EWα,L,N = L. But how should one expect Wα,L,N to be distributed
in the limit N → ∞ (for a generic dilate α)? Consider the same random model as
N
before, in which (αn 2 mod 1)n=1
is replaced by N independent random variables
X 1 , . . . , X N , which are uniformly distributed on [0, 1). Then, if L is constant as N → ∞,
letting
(1.4)
Z L,N ∶= ∣{n ⩽ N ∶ X n ∈ [Y , Y + L/N] mod 1}∣,
1
This is by the law of the iterated logarithm. See Khintchine [14], as well as Chung [6, Theorem 2∗ ]
and Kuipers and Niederreiter [15, p. 98].
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
3
one may calculate that
d i st
Z L,N ÐÐ→ Po(L)
as N → ∞, where Po(L) is a Poisson-distributed random variable with parameter L.
Having described this random model, we can now state the following remarkable
conjecture.
Conjecture 1.1 (Rudnick and Sarnak2 [23].)
L > 0,
For almost all α ∈ [0, 1], for all fixed
d i st
Wα,L,N ÐÐ→ Po(L),
as N → ∞.
If true, this conjecture would represent a strong local notion of pseudorandomness
for the sequence (αn 2 mod 1)∞
n=1 , at least for a generic α. In fact, a further conjecture
[24, p. 38] posits more information about the full-measure set of suitable dilates α. To
state this conjecture, we recall that α is of type ω if there are only finitely many pairs
(a, q) with ∣α − a/q∣ < q−ω .
Conjecture 1.2 (Rudnick, Sarnak, and Zaharescu.)
and the convergents a/q to α satisfy
If α is of type 2 + ε for all ε > 0,
lim log q̃/ log q = 1,
q→∞
where q̃ is the square-free part of q, then, for all fixed L > 0, as N → ∞, one has
d i st
Wα,L,N ÐÐ→ Po(L).
Remark: We remind the reader that, by Dirichlet’s approximation theorem, the type
of a real number is never less than 2. Furthermore, by Khintchine’s theorem, a generic
number
√ is of type 2 + ε for all ε > 0. One can also readily find explicit examples, like
α = 2, by using continued fractions.
Conjectures 1.1 and 1.2 appear to lie very deep. One hypothetical approach for
showing the desired convergence in distribution would be to use the method of
moments. More precisely, if one could show, for a generic α, for a random variable
k
X L ∼ Po(L), and for all k ∈ N, that E Wα,L,N
→ E X Lk as N → ∞, then Conjecture
1.1 would follow. Rudnick and Zaharescu [25] used this approach to show that if
(a n )∞
n=1 is a lacunary sequence of natural numbers, then, for almost all α, the sequence
(αa n mod 1)∞
n=1 has spacing statistics that agree with the Poisson model. For the
squares, the first nontrivial case is k = 2, and this was settled by Rudnick and Sarnak3
some 22 years ago.
2
These authors refer to the “distribution of the spacings between the elements,” rather than mentioning the random variables Wα ,L,N directly, but these are equivalent notions. For more on this other
perspective, see the introduction to [24].
3
We should remark that Rudnick and Sarnak phrased their result in terms of the pair correlation
function, rather than explicitly mentioning Wα ,L,N , but the results are equivalent.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
4
N. Technau and A. Walker
Theorem 1.3 (Rudnick and Sarnak [23]) For almost all α ∈ [0, 1], for all fixed L > 0,
2
= L + L 2 + o α,L (1),
EWα,L,N
as N → ∞.
This gives an estimate on the so-called number variance Var(Wα,L,N ), namely
Var(Wα,L,N ) = L + o α,L (1),
as N → ∞.
Very little is known regarding the higher moments, although certain results can be
extracted from the literature. For larger L, the issue is settled by the aforementioned
discrepancy bounds. Indeed, expression (1.2) implies that, for almost all α ∈ [0, 1], for
all integers k ⩾ 1, for all N ∈ N, and for all L ∈ R in the range N 1/2+ε ⩽ L ⩽ N,
k
EWα,L,N
= L k (1 + o α,ε,k (1)),
(1.5)
as N → ∞ (where the error term is independent of the choice of parameters L). We
will give the simple proof of (1.5) in Appendix B, alongside other consequences of the
discrepancy bounds. In passing, we will answer a question of Steinerberger from [27].
One might wonder whether the methods that Rudnick and Sarnak introduced
to tackle the second moment in Theorem 1.3 could be applied to higher moments.
Unfortunately, in the case when L is constant, Rudnick and Sarnak already noted in
[23, Section 4] that their method faces a major obstruction when applied to higher
moments. We will describe this obstruction in Appendix C, where (for the third
moment) we observe that the obstruction persists for L as large as N 1/3 .
We now present the main result of this paper.
Theorem 1.4 (Main Theorem) Let ε > 0. Then, for almost all α ∈ [0, 1], for all N ∈ N,
and for all L ∈ R in the range N 1/4+ε < L ⩽ N, we have
3
= L 3 (1 + o α,ε (1)),
EWα,L,N
as N → ∞, where the o α,ε (1) term is independent of the choice of the parameters L.
Note, in particular, that 41 < 31 , so we successfully give an asymptotic expression for
the third moment in part of the range in which the Rudnick–Sarnak obstruction holds
(see Appendix C).
To prove Theorem 1.4, we will first perform a standard reduction to a statement
concerning correlation functions. These functions are closely related to the moments
k
EWα,L,N
, but they can be more convenient to analyze.
Definition 1.1 (Correlation functions) Let α ∈ [0, 1], let k ⩾ 2 be a natural number,
and let g ∶ R k−1 → [0, 1] be a compactly supported function. Then, the kth correlation
function R k (α, L, N , g) is defined to be
1
N
∑
1⩽x 1 ,...,x k ⩽N
distinct
g(
N
N
N
2
{α(x 12 − x 22 )}sgn , {α(x 22 − x 32 )}sgn , . . . , {α(x k−1
− x k2 )}sgn ) ,
L
L
L
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
5
where
{⋅}sgn ∶ R Ð→ (−1/2, 1/2]
denotes the signed distance to the nearest integer, and N and L are real parameters.
In practice, one only needs to understand the correlation functions when the test
function g is sufficiently nice, for example, smooth or the indicator function of a box.
When proving Theorem 1.3, Rudnick and Sarnak manipulated the pair correlation
function R 2 (α, L, N , g). We manipulate the triple correlation function R 3 (α, L, N , g),
proving the following result, which, for readers more familiar with correlation functions than with the moments of Wα,L,N , might seem to be more natural.
Theorem 1.5 (Triple correlations) Let ε > 0. Then, for almost all α ∈ [0, 1], for all
compactly supported continuous functions g ∶ R2 Ð→ [0, 1], for all N ∈ N, and for all
L ∈ R in the range N 1/4+ε < L < N 1−ε , we have
R 3 (α, L, N , g) = (1 + o α,ε, g (1))L 2 (∫ g(w) dw) ,
as N → ∞, where the o α,ε, g (1) term is independent of the choice of the parameters L.
Before continuing to survey other relevant papers, we should stop to explain why
the spacing statistics of the sequence αn 2 mod 1 are of a particular interest (aside
from as a part of the larger endeavor of finding pseudorandomness in arithmetic
sequences). This is due to a connection between number theory and theoretical
physics known as arithmetic quantum chaos. In brief, the spacing statistics between
elements of the sequence of αn 2 mod 1 correspond to the spacing statistics between
the eigenvalues of a certain quantum system. (This system is a two-dimensional boxed
oscillator, with a harmonic potential in one direction and hard walls in the other,
as described in the introduction to [23].) A famous and far-reaching observation of
Berry and Tabor [4] then suggests that such spacing statistics in the semiclassical
limit (i.e., the distribution of Wα,L,N for constant L as N → ∞) should be determined
by the dynamics of the corresponding classical system. Regular (integrable) classical
dynamics should correspond to spacing statistics in asymptotic agreement with the
Poisson model.
Unfortunately, if α is rational (or is very well approximated by rationals with square
denominators), it is easy to prove that high moments of Wα,L,N do not agree with the
Poisson model (see [24, Theorem 2])! However, excluding such zero-measure counterexamples,4 one might still hope for a metric result.
There are precious few examples of fixed sequences for
√ which full information
is known about the spacing statistics. For the sequence ( n mod 1)∞
n=1 , Elkies and
McMullen
[8]
have
established,
using
dynamical
methods,
the
gap
distribution of
√
( n mod 1)∞
n=1 (which is not from a Poisson model); El-Baz, Marklof, and Vinogradov
[7] have demonstrated that the second moment of this gap distribution is nonetheless
in accordance with the Poisson model. Furthermore, there are the results in [10] and
[18] on the second moment of the gap distribution between values of quadratic forms.
4
Zaharescu [30] showed, in a precise sense, that, at least among all very well approximable α, these
were the only counterexamples.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
6
N. Technau and A. Walker
However, most of the results in the literature are metric in at least one parameter (e.g.,
[2, 23, 25, 26]), and such results still have substantial content.
To delve further into the relationship to theoretical physics and to other, spacing
statistics would be to digress too far from our main theme; we direct the interested
reader to the articles of Marklof [17] and Rudnick [20] for more on these issues.
Returning to the study of the moments of Wα,L,N , and the discussion of relevant
work, we continue with the paper [24]. Here, Rudnick, Sarnak, and Zaharescu develop
tools to relate the Diophantine approximation properties of α to the size of the
k
moments EWα,L,N
. The main result of that paper can be phrased as follows.
Theorem 1.6 (Rudnick, Sarnak, and Zaharescu) Let α ∈ [0, 1], and suppose that there
are infinitely many rationals b j /q j , with q j prime, satisfying
(1.6)
∣α −
bj
1
∣< 3.
qj
qj
Then, there is a subsequence N j → ∞, with log N j / log q j → 1 for which, for all L > 0,
d i st
Wα,L,N j ÐÐ→ Po(L),
as j → ∞.
The authors of [24] sacrificed the genericness of α (working instead with those
α which are unusually well-approximable) in favor of control over the moments
k
EWα,L,N
for constant L and for all k. One may switch objectives in their analysis,
j
sacrificing the range of L in order to work with almost all α. Applying this switch in
the context of the third moment calculation, their method shows the following result.
Theorem 1.7 (Method of R–S–Z) Let ε > 0. Then, for almost all α ∈ [0, 1], for all N ∈
N, and for all L ∈ R in the range N 3/5+ε < L ⩽ N, we have
3
EWα,L,N
= L 3 (1 + o α,ε (1)),
as N → ∞, where the o α,ε (1) term is independent of the choice of parameters L.
Moreover, one can give an explicit description of a suitable full-measure set of suitable α
(in terms of properties of rational approximations to α).
Although the range of L in Theorem 1.7 is much smaller than the range in our
result, the method of R–S–Z yields a more explicit description of the full-measure set
of suitable α. We will indicate how to extract Theorem 1.7 from [24] in Section 3 below.
Our approach to proving Theorem 1.5 is inspired by this work of R–S–Z, but also
by the work of Heath-Brown in [12], who introduced a related technique for studying
the pair correlation function R 2 (α, L, N , g). The full description of the method will
come in Section 3, but we sketch the idea here, so as to explain in a rough way how we
extract an improvement over [24]. After having replaced α by a rational approximation
a/q, one transforms the triple correlation function into an expression that counts the
number of solutions to certain polynomial equations modulo q. The equations which
occur are of the form
(1.7)
{x, y, z ⩽ N ∶ x 2 − y 2 ≡ c 1 (mod q), y 2 − z 2 ≡ c 2 (mod q)},
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
7
for certain ranges of N and q and for certain sets of coefficients c 1 and c 2 . In [24],
the number of solutions was estimated by using the “completion of sums” technique
to remove the N cutoff, followed by an implementation of the Hasse–Weil bound. In
our work, we manage to take advantage of the extra averaging over c 1 and c 2 , which is
present in the problem, together with some intricate (although elementary) exponential sum arguments, which ends up leading to a stronger bound for certain ranges of N
and q. Heath-Brown did something similar for the pair correlation function [12], but
the analysis of the relevant exponential sums for the triple correlations is substantially
more delicate.
Another relevant work is the paper of Rudnick and Kurlberg [22], in which those
authors established that the spacing of quadratic residues mod q as the number of
prime factors of q grows is in agreement with the Poisson model. Lemma 3.3 below
could be viewed as a special case of the arguments of that paper. However, as will
become evident, the work here necessarily concerns a rather sparse subset of the set of
quadratic residues mod q, as N ≈ q θ with θ < 1, and so the work of [22] is not directly
applicable. Pair correlations of rational functions mod q were also studied by Boca and
Zaharescu [5], but again, we will not be able to use that paper directly.
Very recently, and after the first version of the present manuscript was submitted,
Lutsko released a preprint [16] which generalized Theorem 1.5 to all long-range
k−2
correlation functions R k (α, L, N , g), provided L ⩾ N 2k−2 +ε . When k = 3, the threshold
k−2
k−2
recovers our threshold of N 1/4 for triple correlations. Note also that 2k−2
→ 21
2k−2
from below as k → ∞, that is, Lutsko’s threshold approaches the trivial discrepancy
bound for large k. The methods of [16] seem to be completely different to our own, and
are instead based on an analysis of the sums ∑n⩽N e(αmn 2 ) using the van der Corput
B process (and subsequent stationary phase estimates). It remains to be seen whether
any improvement to the threshold c 3 = 41 could be derived by combining these two
different approaches.
The structure of the paper is as follows. In Section 2, we will give the standard
argument (passing from moments to correlation functions), which reduces Theorem
1.4 to Theorem 1.5. The proof of Theorem 1.5 is then given in Section 3, in which it is
resolved subject to three auxiliary results (one concerning Diophantine approximation, and the other two concerning the number of solutions to certain Diophantine
equations similar to (1.7)). The final three sections of the paper settle these results—
one per section—thus concluding the main proof.
Appendices A–C contain some arguments that are minor modifications of the literature (but which are nonetheless pertinent to the main paper). These are, respectively,
a version of Theorem 1.3 in which L grows with N, the proof of the asymptotic (1.5), and
the discussion of the obstruction to the study of triple correlations that was identified
by Rudnick and Sarnak.
1.1 Notation
Most of our notation is standard, but perhaps we should highlight a few conventions.
For a natural number q, we let e q (x) be a shorthand for e 2π i x/q , and given a parameter
M ⩾ 1, we let [M] denote the set {m ∈ N ∶ 1 ⩽ m ⩽ M}. In particular, 1[M] denotes the
indicator function of all the natural numbers at most M. If a range of summation is
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
8
N. Technau and A. Walker
given as ∑m⩽M , then it is assumed that m is a natural number and that m ⩾ 1. Finally,
for x ∈ R, we will use ∥x∥ to denote the distance from x to the nearest integer, and
{x}sgn to denote the signed distance to the nearest integer from x.
2 Reduction to correlation functions
In this short section, we will reduce Theorem 1.4 to Theorem 1.5. First, because estimate
(1.5) holds for large L, we may assume, without loss of generality, that L < N 1−ε . Then,
we use linearity of expectation to deduce that
3
=
EWα,L,N
2
2
2
∑ P(αx , α y , αz ∈ [Y , Y + L/N] mod 1)
x , y,z⩽N
= ∑ P(αx 2 ∈ [Y , Y + L/N] mod 1)
x⩽N
+ 3 ∑ P(αx 2 , α y 2 ∈ [Y , Y + L/N] mod 1)
x , y⩽N
x≠y
(2.1)
+
∑
x , y,z,⩽N
distinct
P(αx 2 , α y 2 , αz 2 ∈ [Y , Y + L/N] mod 1),
where Y is a random variable that is uniformly distributed modulo 1. The first of
the three terms in (2.1) is equal to L, and so may be absorbed into the error term
of Theorem 1.4. The second term is
=3
∑
x , y⩽N
x≠y
∥α(x 2 −y 2 )∥⩽L/N
(
L
− ∥α(x 2 − y 2 )∥)
N
⎛
⎞
N
1
2
2
⎟
=3L ⎜
⎜ N ∑ max (0, 1 − L ∥α(x − y )∥)⎟
x
,
y,⩽N
⎝
⎠
x≠y
=3LR 2 (α, L, N , f ),
(2.2)
where f ∶ R Ð→ [0, 1] is the function f (x) = max(0, 1 − ∣x∣) and R 2 (α, L, N , f ) is the
pair correlation function as defined in Definition 1.1. In Appendix A, we will show, by
a trivial adaptation of the known techniques, that, for almost all α ∈ [0, 1], one has
R 2 (α, L, N , f ) = (1 + o α, f (1))L (∫ f (x) dx) = (1 + o α, f (1))L.
(2.3)
Therefore, expression (2.2) is equal to 3L 2 + o α, f (L 2 ), which may also be absorbed
into the error term of Theorem 1.4.
What remains is the third term of (2.1). This is equal to
(2.4)
L
N
∑ max(0, (1 −
x , y,z⩽N
distinct
N
max(∥α(x 2 − y 2 )∥, ∥α(y 2 − z 2 )∥, ∥α(z 2 − x 2 )∥)).
L
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
9
Because (x 2 − y 2 ) + (y 2 − z 2 ) = x 2 − z 2 , we see that (2.4) is equal to an expression
of the form R 3 (α, L, N , g) for some continuous compactly supported function g ∶
R2 Ð→ [0, 1]. Indeed,
⎧
max(0, 1 − w 1 − w 2 )
⎪
⎪
⎪
⎪
⎪
⎪
⎪max(0, 1 − max(w 1 , −w 2 ))
g(w 1 , w 2 ) = ⎨
⎪
max(0, 1 − max(−w 1 , w 2 ))
⎪
⎪
⎪
⎪
⎪
⎪
⎩max(0, 1 + w 1 + w 2 )
w 1 , w 2 ⩾ 0,
w 1 ⩾ 0, w 2 ⩽ 0,
w 1 ⩽ 0, w 2 ⩾ 0,
w 1 , w 2 ⩽ 0.
An elementary calculation then demonstrates that
∞ ∞
∫ ∫ g(w 1 , w 2 ) dw 1 dw 2 = 1.
−∞ −∞
Therefore, by Theorem 1.5, if L > N 4 +ε , then, for almost all α ∈ [0, 1], expression (2.4)
∎
is equal to L 3 (1 + o α,ε (1)) as N → ∞. So Theorem 1.4 is proved.
We make the usual remark that, by approximating the continuous function g above
and below by step functions, to prove Theorem 1.5, it will be enough to prove the
following result.
1
Theorem 2.1 Let ε > 0. Then, for almost all α ∈ [0, 1], for all s 1 , t 1 , s 2 , t 2 ∈ R for which
s 1 < t 1 and s 2 < t 2 , and for all L in the range N 1/4+ε < L < N 1−ε , we have
R 3 (α, L, N , g s,t ) = (1 + o α,ε,s,t (1))L 2 (t 1 − s 1 )(t 2 − s 2 ),
as N → ∞, where s = (s 1 , s 2 ), t = (t 1 , t 2 ), and g s,t is the indicator function of the box
[s 1 , t 1 ] × [s 2 , t 2 ].
It will turn out to be crucial in our subsequent methods that the ratio log L/ log N
does not vary too wildly. To finish this section, we will show how to deduce Theorem
2.1 from the following weaker result.
Theorem 2.2 Let β ∈ (0, 43 ) and let η > 0. Then, if η max(β −1 , ( 43 − β)−1 ) is small
enough, the following holds: for almost all α ∈ [0, 1], for all s 1 , t 1 , s 2 , t 2 ∈ R for which
s 1 < t 1 and s 2 < t 2 , for all N ∈ N, and for all L ∈ R in the range N 1−β−η < L < N 1−β+η ,
we have
R 3 (α, L, N , g s,t ) = (1 + o α,β,η,s,t (1))L 2 (t 1 − s 1 )(t 2 − s 2 ),
as N → ∞, where s = (s 1 , s 2 ), t = (t 1 , t 2 ), and g s,t is the indicator function of the box
[s 1 , t 1 ] × [s 2 , t 2 ]. The error term is independent of the exact choice of the parameters L.
Proof Let ε > 0 and choose η > 0 such that ηε−1 is suitably small. Let {β 1 , . . . , β R }
be a maximal η-separated subset of [ε, 43 − ε]. Then, for each β i , we get a full-measure
set Ω i ⊂ [0, 1] such that if α ∈ Ω i , the conclusion of Theorem 2.2 holds with β = β i and
with η as chosen. We claim that Ω = ∩ i⩽R Ω i is a suitable full-measure set of values of
α in Theorem 2.1.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
10
N. Technau and A. Walker
Indeed, let α ∈ Ω and let s 1 , t 1 , s 2 , t 2 ∈ R with s 1 < t 1 and s 2 < t 2 . From Theorem 2.2,
we know that, for all i ⩽ R, for any δ > 0, for all N ⩾ N 0 (α, β i , η, δ, s, t), and for any L
in the range N 1−β i −η < L < N 1−β i +η , we have
∣R 3 (α, L, N , g s,t ) − L 2 (t 1 − s 1 )(t 2 − s 2 )∣ < δL 2 .
(2.5)
Now, given any N and any L in the range N 1/4+ε < L < N 1−ε , there exists an i such that
N 1−β i −η < L < N 1−β i +η . Therefore, if N ⩾ max i⩽R N 0 (α, β i , η, δ, s, t), the inequality
(2.5) holds. Because δ is arbitrary, the conclusion of Theorem 2.1 holds.
∎
3 Proof of Theorem 2.2
Our task is now to prove Theorem 2.2. Let us fix β and η, which is assumed to be
sufficiently small, and for the time being, let us also fix some α ∈ [0, 1] and some
s 1 , t 1 , s 2 , t 2 ∈ R with s 1 < t 1 and s 2 < t 2 . We may also assume, without loss of generality,
that N is sufficiently large in terms of α, β, η, s, and t.
We begin by replacing α with a suitably good rational approximation a/q. Assume
that there exists some rational a/q, with q prime, for which
a
1
∣α − ∣ ⩽ 2−η
q
q
(3.1)
and
(3.2)
N
2+β
2 +10η
⩽ q ⩽ 2N
2+β
2 +10η
.
We will show in Section 4 that almost all α admit such an approximation. Then, for
such a pair (N , q), we define
(3.3) A(N , q, c 1 , c 2 ) ∶= ∣{x, y, z ⩽ N ∶ x 2 − y 2 ≡ c 1 (mod q), y 2 − z 2 ≡ c 2 (mod q)}∣.
We then claim that
(3.4)
1
N
∑
(r 1 ,r 2 )∈S
r 1 r 2 ≠0
r 1 +r 2 ≠0
−
A(N , q, ar 1 , ar 2 ) ⩽ R 3 (α, L, N , g s,t ) ⩽
1
N
∑
(r 1 ,r 2 )∈S +
r 1 r 2 ≠0
r 1 +r 2 ≠0
A(N , q, ar 1 , ar 2 ),
where a denotes the inverse of a modulo q,
(3.5)
S − = {(r 1 , r 2 ) ∈ Z2 ∶
s i qL
N2
t i qL
N2
+ 1−η ⩽ r i ⩽
− 1−η , i = 1, 2} ,
N
q
N
q
S + = {(r 1 , r 2 ) ∈ Z2 ∶
s i qL
N2
t i qL
N2
− 1−η ⩽ r i ⩽
+ 1−η , i = 1, 2} .
N
q
N
q
and
(3.6)
Indeed, given x, y in the range 1 ⩽ x, y ⩽ N, let us consider r 1 ∈ Z to be defined by
the relation
a(x 2 − y 2 ) ≡ r 1 (mod q)
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
11
and −q/2 < r 1 < q/2. Suppose that
s 1 qL
N2
t 1 qL
N2
+ 1−η ⩽ r 1 ⩽
− 1−η .
N
q
N
q
Then, we have the inequalities
a
r1
N2
L
a
{α(x 2 − y 2 )}sgn ⩽ { (x 2 − y 2 )} + ∣(α − ) (x 2 − y 2 )∣ ⩽ + 2−η ⩽ t 1
q
q
q
q
N
sgn
and
a
a
r1
N2
L
{α(x 2 − y 2 )}sgn ⩾ { (x 2 − y 2 )} − ∣(α − ) (x 2 − y 2 )∣ ⩾ − 2−η ⩾ s 1 .
q
q
q q
N
sgn
Suppose instead that
L
L
⩽ {α(x 2 − y 2 )}sgn ⩽ t 1 .
N
N
Then, similarly to the above, we have
s1
a
a
Lq
N2
r 1 = q { (x 2 − y 2 )} ⩽ q ({α(x 2 − y 2 )}sgn + ∣(α − ) (x 2 − y 2 )∣) ⩽ t 1
+ 1−η
q
q
N q
sgn
and
a
Lq
N2
a
− 1−η .
r 1 = q { (x 2 − y 2 )} ⩾ q ({α(x 2 − y 2 )}sgn − ∣(α − ) (x 2 − y 2 )∣) ⩾ s 1
q
q
N q
sgn
Finally, take 1 ⩽ z ⩽ N and define r 2 by the relation
a(y 2 − z 2 ) ≡ r 2 (mod q)
with −q/2 < r 2 < q/2. Then, because N < q/2, we have that x, y, z are distinct if and
only if r 1 r 2 ≠ 0 and r 1 + r 2 ≠ 0.
From all these observations taken together, claim (3.4) is settled.
Remark: The reader might find it helpful to note, at this early stage, that the relative
sizes of N and q were chosen, so that the two terms qL/N and N 2 /q 1−η which appear
in (3.5) and (3.6) are of approximately the same magnitude, namely q(2−β)/(2+β) .
A substantial portion of this paper will involve estimating the quantity
A(N , q, c 1 , c 2 ), on average over c 1 and c 2 . To this end, we let
A 0 (q, c 1 , c 2 ) = ∣{x, y, z ⩽ q ∶ x 2 − y 2 ≡ c 1 (mod q), y 2 − z 2 ≡ c 2 (mod q)}∣
be the number of solutions to the key congruences, in which the variables x, y, z may
range over the entire field Z/qZ. One might reasonably expect that
A(M, q, c 1 , c 2 ) ≈ (M/q)3 A 0 (q, c 1 , c 2 )
as long as M is large enough, and so we introduce the difference
3
RR
RR
RR
RR
M
(3.7)
∆(M, q, c 1 , c 2 ) ∶= RRRRA(M, q, c 1 , c 2 ) − ( ) A 0 (q, c 1 , c 2 )RRRR .
q
RRR
RRR
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
12
N. Technau and A. Walker
The following key technical lemma will be proved in Section 5.
Lemma 3.1 If q is an odd prime and M < q, then
(3.8)
2
3 3
6 2
∑ ∆(M, q, c 1 , c 2 ) ≪ (log q) M + (log q) q ,
c 1 ,c 2 ⩽q
as q → ∞.
Lemma 3.1 can be used, in turn, in Section 6 to prove the following lemma on the
average size of the error terms ∆(N , q, ar 1 , ar 2 ), which is helpful for analyzing the
relation (3.4).
Lemma 3.2 Let β ∈ (0, 43 ) and let η > 0. Then, if η max(β −1 , ( 43 − β)−1 ) is small
enough, the following holds: for almost all α ∈ [0, 1], for all s 1 , t 1 , s 2 , t 2 ∈ R for which
s 1 < t 1 and s 2 < t 2 , for all N ∈ N, and for all L ∈ R in the range N 1−β−η < L < N 1−β+η ,
and for all prime q and a/q satisfying (3.1) and (3.2), we have
(3.9)
1
N
∑
(r 1 ,r 2 )∈S
r 1 r 2 ≠0
r 1 +r 2 ≠0
+
∆(N , q, ar 1 , ar 2 ) ≪α,β,η,s,t L 2 q−η ,
where S + is as defined in (3.6).
At this point, it is worth us taking a small diversion from the main proof to discuss
the numerology in Lemma 3.1, and why the bound is close to best possible. We begin
with an easy lemma concerning A 0 (q, c 1 , c 2 ) itself. A more sophisticated version of
this lemma was worked out by Rudnick and Kurlberg in [22, Proposition 4], but to
make our exposition self-contained, we decided to include a direct proof for the triple
correlation case.
Lemma 3.3 Let q be an odd prime. If c 1 , c 2 are not both divisible by q, then
√
⎧
⎪
q + O( q)
⎪
⎪
⎪
⎪
⎪
⎪2q − 1 − ( −c 2 )
⎪
⎪
q
A 0 (q, c 1 , c 2 ) = ⎨
c1
⎪
2q − 1 − ( q )
⎪
⎪
⎪
⎪
⎪
⎪
⎪2q − 1 − ( c 2 )
⎪
q
⎩
if c 1 + c 2 ≡/ 0 (mod q),
if c 1 ≡ 0 (mod q),
if c 2 ≡ 0 (mod q),
if c 1 + c 2 ≡ 0 (mod q),
where ( q⋅ ) denotes the Legendre symbol modulo q. Furthermore, A 0 (q, 0, 0) = 4q − 3.
Remark: For this paper, it would have been enough to consider only the nondegenerate case of Lemma 3.3. However, the arguments of Section 5 are cleaner if we allow
ourselves to include the degenerate cases, which is the reason why we do so.
Proof In the trivial case c 1 ≡ c 2 ≡ 0 (mod q), we note that y = ±x (mod q) and z =
±y (mod q). So, except for when x = q, there are four choices for y, z given a fixed x,
thus showing that A q (0, 0) = 4(q − 1) + 1 = 4q − 3 as claimed.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
13
Thus, we assume in the following that at least one of c 1 , c 2 is not divisible by q. We
have that
1
e q ( j(x 2 − y 2 − c 1 ) + k(y 2 − z 2 − c 2 )).
A 0 (q, c 1 , c 2 ) = 2
∑
q 0⩽ j,k⩽q−1
0⩽x , y,z⩽q−1
The terms when j = 0, k = 0, or j = k contribute
⎛
⎞
1 ⎜ 2
3⎟
2
2
⎜q
1 − 2q ⎟ ,
1+q
1+q
∑
∑
∑
⎟
q2 ⎜
0⩽x , y⩽q−1
0⩽y,z⩽q−1
0⩽x ,z⩽q−1
⎝ y 2 ≡x 2 −c 1 (mod q)
⎠
y 2 ≡z 2 +c 2 (mod q)
x 2 =z 2 +c 1 +c 2
where the final term is used to correct for the overcounting of ( j, k) = (0, 0). By factorizing the ranges of summation using the difference of two squares, this expression
is equal to
q + (q − 1)(1q∣c 1 + 1q∣c 2 + 1q∣(c 1 +c 2 ) ).
Recall the Gauss sum evaluation
j
√
e q ( jx 2 ) = ( ) ε q q,
q
0⩽x⩽q−1
∑
(3.10)
when q ∤ j and where ε q = 1 if q ≡ 1 (mod 4) and ε q = i if q ≡ 3 (mod 4) (see [13,
Theorem 3.3]). Therefore, the contribution from the remaining frequencies j, k is
exactly
ε 3q
q 1/2
k−j
−k
j
) ( ) e q (− jc 1 )e q (−kc 2 ).
( )(
q
q
q
1⩽ j,k⩽q−1
∑
j≠k
By introducing a change of variables k = l j, this expression is equal to
(3.11)
ε 3q
q 1/2
(
−1
j
l
l −1
) ∑ (
) ( ) ∑ ( ) e q ( j(−c 1 − l c 2 )).
q 2⩽l ⩽q−1 q
q 1⩽ j⩽q−1 q
One can evaluate the inner sum of (3.11) using (3.10). Indeed, for an arbitrary integer
m and an arbitrary quadratic nonresidue h, we have
⎞
j
1⎛
2
2
∑ ( ) e q ( jm) =
∑ e q (mn ) − ∑ e q (mn h)
2 ⎝1⩽n⩽q−1
⎠
1⩽n⩽q−1
1⩽ j⩽q−1 q
=
=
⎞
1⎛
2
2
∑ e q (mn ) − ∑ e q (mn h)
2 ⎝0⩽n⩽q−1
⎠
0⩽n⩽q−1
1
m
hm
√
(( ) − (
)) ε q q
2
q
q
=(
m
√
) ε q q.
q
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
14
N. Technau and A. Walker
Plugging this expression into (3.11), we establish that (3.11) is equal to
∑ (
2⩽l ⩽q−1
l(l − 1)(l c 2 + c 1 )
).
q
√
This is O( q) by Hasse (see [13, equation (14.32)]), provided that neither q∣c 1 , q∣c 2 ,
nor q∣(c 1 + c 2 ).
In the singular cases, if q divides c 1 , we end up with
(
l −1
c2
) ∑ (
),
q 2⩽l ⩽q−1 q
which is equal to −( −cq 2 ). If q∣(c 1 + c 2 ), we end up with
(
c2
l
) ∑ ( ),
q 2⩽l ⩽q−1 q
which is equal to −( cq2 ). The final case, when q∣c 2 , follows easily from the case q∣c 1
after permuting the variables x, y, z in the original expression for A 0 (q, c 1 , c 2 ).
∎
Therefore, (M/q)3 A 0 (q, c 1 , c 2 ) ≍ M 3 q−2 , and this is the expected size of
A(M, q, c 1 , c 2 ). We note, then, that the M 3 term in (3.8) represents “square-root
cancellation on average” for the size of ∆(M, q, c 1 , c 2 ).
Moreover, Lemma 3.1 is close to best possible, at least for certain ranges of M.
We would like to emphasize that here and throughout, M denotes a positive integer.
Indeed, if M < λq 2/3 and λ is a suitably small constant, then we have the matching
lower bound
2
3
∑ ∆(M, q, c 1 , c 2 ) ≫ M .
(3.12)
c 1 ,c 2 ⩽q
Proof Certainly,
3
∑ A(M, q, c 1 , c 2 ) = M .
c 1 ,c 2 ⩽q
Furthermore,
2
∑ A(M, q, c 1 , c 2 ) =
c 1 ,c 2 ⩽q
∑
1.
x , y,z,x ′ , y ′ ,z ′ ⩽M
x 2 −(x ′ )2 ≡y 2 −(y ′ )2 ≡z 2 −(z ′ )2 (mod q)
By using the divisor bound to control the terms arising when x 2 − (x ′ )2 ≠ 0, this sum
is at most O(q o(1) M 2 q−1 + M 3 ), which is certainly at most O(M 3 ).
Therefore, by Cauchy–Schwarz,
⎛
⎞ ⎛
2⎞
⩾ ∑ A(M, q, c 1 , c 2 )
∑ A(M, q, c 1 , c 2 )
⎝c 1 ,c 2 ⩽q
⎠ ⎝c 1 ,c 2 ⩽q
⎠
2
∑ 1 A(M,q,c 1 ,c 2 )⩾1
c 1 ,c 2 ⩽q
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
−1
≫ M3.
On the triple correlations of fractional parts of n 2 α
If Mq−2/3 is sufficiently small, then (M/q)3 A 0 (q, c 1 , c 2 ) <
15
1
2
for all c 1 , c 2 . Hence,
3
∑ ∆(M, q, c 1 , c 2 ) ≫ ∑ 1 A(M,q,c 1 ,c 2 )⩾1 ≫ M
2
c 1 ,c 2 ⩽q
c 1 ,c 2 ⩽q
as claimed.
∎
Rudnick, Sarnak, and Zaharescu [24] also considered ∆(M, q, c 1 , c 2 ). By Fourierexpanding the cutoff x, y, z ⩽ M, they derived
RR
RRRR
RRR
RRR
RRR
R
3
RRR
RR
∆(M, q, c 1 , c 2 ) ⩽
e q (b 1 x + b 2 y + b 3 z)RRRR ,
∑
∏ ∣̂1[M] (b i )∣ RRRR
∑
RRR
RR 0⩽x , y,z⩽q−1
0⩽b 1 ,b 2 ,b 3 ⩽q−1 i=1
RRR 2 2
RRR
b≠0
RRRx 2 −y2 ≡c 1 (mod q)
RRR
RR y −z ≡c 2 (mod q)
RR
where
1
(3.13)
1̂
∑ e q (−bx).
[M] (b) =
q x⩽M
In expression (9.15) of [24] they used the Weil bound,5 ending up with a bound of
(3.14)
∆(M, q, c 1 , c 2 ) ≪ q 1/2 (log q)3 ,
in the nondegenerate cases. Comparing this result to Lemma 3.1, estimate (3.14)
implies
(3.15)
2
3+o(1)
,
∑ ∆(M, q, c 1 , c 2 ) ≪ q
c 1 ,c 2 ⩽q
which is weaker than Lemma 3.1. We note, therefore, that the proof of Lemma 3.1 must
utilize the extra averaging in c 1 and c 2 in a critical way.
In the introduction, we promised to explain how the threshold N 3/5 in Theorem
1.7 arises from the arguments of [24], and now seems to be an appropriate moment.
Indeed, in order to analyze (3.4), it would be enough to show that
∆(N , q, ar 1 , ar 2 ) = o((N/q)3 A 0 (q, ar 1 , ar 2 )),
because then one could replace A(N , q, ar 1 , ar 2 ) with (N/q)3 A 0 (q, ar 1 , ar 2 ) (and
then evaluate these latter terms explicitly using Lemma 3.3). Using the bound (3.14),
this is only possible when N 3 q−2 > q 1/2 (log q)3 , that is, provided that N ⩾ q 5/6+o(1) .
This, one notes, is the same as the threshold from Theorem 4 of [24] taken with m = 3
2
(for triple correlations). Noting that N ≈ q 2+β , this approach succeeds provided that
2/(2 + β) > 5/6, that is, provided that β < 2/5. From the definition of β, this implies
that L must be at least N 3/5 .
Having finished our diversion on the subject of Lemma 3.1 (whose proof is deferred
to Section 5), let us return to the main argument, namely the proof of Theorem
5
For triple correlations, the relevant curve has genus 1, so this is, in fact, the same Hasse bound as
we used in Lemma 3.3.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
16
N. Technau and A. Walker
2.2. From now on, we assume α satisfies Lemma 3.2. Putting this information into
expression (3.4), we derive
N 2 q−3
∑
(r 1 ,r 2 )∈S
r 1 r 2 ≠0
r 1 +r 2 ≠0
−
(3.16)
A 0 (q, ar 1 , ar 2 ) − O α,β,η,s,t (L 2 q−η ) ⩽ R′3 (α, L, N , g s,t )
⩽ N 2 q−3
∑
(r 1 ,r 2 )∈S
r 1 r 2 ≠0
r 1 +r 2 ≠0
+
A 0 (q, ar 1 , ar 2 ) + O α,β,η,s,t (L 2 q−η ).
To estimate the terms in the expression (3.16), we use Lemma 3.3. Indeed,
(3.17)
∑
N 2 q−3
(r 1 ,r 2 )∈S
r 1 r 2 ≠0
r 1 +r 2 ≠0
±
A 0 (q, ar 1 , ar 2 ) = N 2 q−3
∑
(r 1 ,r 2 )∈S
r 1 r 2 ≠0
r 1 +r 2 ≠0
±
(q + O(q 1/2 )).
The size of S ± is given by
((t 1 − s 1 )
N2
qL
N2
qL
± 2 1−η + O(1)) ((t 2 − s 2 )
± 2 1−η + O(1)) .
N
q
N
q
From relation (3.2), one observes that
N 2 q−1+η = o β,η (
qL
),
N
as N → ∞, and therefore, the size of S ± is seen to be
∣S ± ∣ = (1 + o β,η,s,t (1))(t 1 − s 1 )(t 2 − s 2 )q 2 L 2 N −2 ,
(3.18)
as N → ∞. The estimate (3.18) remains after we remove those pairs (r 1 , r 2 ) ∈ S ± with
r 1 r 2 = 0 or r 1 + r 2 = 0.
Thus, returning to (3.17), we conclude that
N 2 q−3
∑
(r 1 ,r 2 )∈S ±
r 1 r 2 ≠0
r 1 +r 2 ≠0
A 0 (q, ar 1 , ar 2 ) = (1 + o β,η,s,t (1))L 2 (t 1 − s 1 )(t 2 − s 2 ).
Substituting this estimate into (3.16), we derive Theorem 2.2 as required.
∎
What remains is to verify that almost all α admit an approximation a/q of the form
required in (3.1) and (3.2), and to prove Lemmas 3.1 and 3.2.
4 Approximation with prime denominator
To derive a suitable approximation a/q with prime denominator, we use the following
quantitative version of (a generalized) Khintchine’s theorem, due to Harman.
Theorem 4.1 [11, Theorem 4.2]
that
Let ψ ∶ N → (0, 1) be a nonincreasing function such
Ψ (N) = ∑ ψ (n)
n≤N
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
17
is unbounded. For B an infinite set of integers, let S (B, α, N) denote the number of
n ≤ N, with n ∈ B, such that ∥nα∥ < ψ (n). Then, for almost all α, we have
S (B, α, N) = 2Ψ(N , B) + O ε ((Ψ(N)) 2 (log Ψ(N))2+ε ),
1
(4.1)
for each ε > 0, where
Ψ (N , B) =
∑
n∈B∩[N]
ψ (n) .
Moreover, the implied constant in (4.1) is uniform in α.
From this, we deduce the following lemma.
Lemma 4.2 Let η ∈ (0, 1). Then, for almost all α and for all N ⩾ N 0 (η), there is a
prime q satisfying
(4.2)
∥qα∥ <
1
,
N 1−η
and
N ⩽ q ≤ 2N .
Proof We let B denote the set of primes, and ψ (n) = n−1+η . Then, Ψ(N) ∼ η−1 N η
and (by the prime number theorem) Ψ(N , B) ∼ η−1 N η (log N)−1 . So, the asymptotic
formula (4.1) shows that, for almost all α, one has that
S (B, α, N) =
η
2N η (1 + o (1))
+ O(η−1 N 2 (log N)3 ),
η log N
with the implied constant uniform in α. From this, it immediately follows that if N is
sufficiently large in terms of η, then
S(B, α, 2N) − S(B, α, N) > 0,
as required.
∎
Therefore, an approximation a/q may be found, with q prime, that satisfies (3.1)
and (3.2).
5 Proof of Lemma 3.1
Expanding the square, we have
2
∑ ∆(N , q, c 1 , c 2 ) = S 1 − 2S 2 + S 3 ,
c 1 ,c 2 ⩽q
where
S 1 = ∑ A(M, q, c 1 , c 2 )2 ,
c 1 ,c 2 ⩽q
M
S2 = ( )
q
3
S3 = (
6
M
)
q
∑ A(M, q, c 1 , c 2 )A 0 (q, c 1 , c 2 ),
c 1 ,c 2 ⩽q
2
∑ A 0 (q, c 1 , c 2 ) .
c 1 ,c 2 ⩽q
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
18
N. Technau and A. Walker
We can write each S i as the number of solutions to certain equations, namely
S1 =
∑
1,
1⩽x , y,z⩽M
1⩽x ′ , y ′ ,z ′ ⩽M
2
2
x −y ≡(x ′ )2 −(y ′ )2 (mod q)
y 2 −z 2 ≡(y ′ )2 −(z ′ )2 (mod q)
M
S2 = ( )
q
3
S3 = (
6
M
)
q
∑
1,
∑
1.
1⩽x , y,z⩽M
1⩽x ′ , y ′ ,z ′ ⩽q
x 2 −y 2 ≡(x ′ )2 −(y ′ )2 (mod q)
y 2 −z 2 ≡(y ′ )2 −(z ′ )2 (mod q)
1⩽x , y,z⩽q
1⩽x ′ , y ′ ,z ′ ⩽q
x 2 −y 2 ≡(x ′ )2 −(y ′ )2 (mod q)
y 2 −z 2 ≡(y ′ )2 −(z ′ )2 (mod q)
Expanding the cutoffs 1 ⩽ x, y, z, x ′ , y′ , z ′ ⩽ M in terms of additive characters, we
have
S1 =
b=(b 1 ,b 2 ,b 3 ,b 4 ,b 5 ,b 6 ),
0⩽b 1 ,b 2 ,b 3 ,b 4 ,b 5 ,b 6 ⩽q−1
where
∑
S(b, q) ∶=
S(b, q) ∏ 1̂
[M] (b i ),
6
∑
x , y,z⩽q
x ′ , y ′ ,z ′ ⩽q
′ 2
x −y ≡(x ) −(y ′ )2 (mod q)
y 2 −z 2 ≡(y ′ )2 −(z ′ )2 (mod q)
2
i=1
e q (b ⋅ (x, y, z, x ′ , y′ , z ′ ))
2
and 1̂
[M] is as in (3.13). The contribution from the term with b = 0 is equal to S 3 .
Performing the same expansion on S 2 , we see that the terms arising from b = 0 cancel,
and we are left with the bound
(5.1)
S 1 − 2S 2 + S 3 ⩽
∑
(∏ ∣1̂
[M] (b i )∣) ∣S(b, q)∣.
6
0⩽b 1 ,b 2 ,b 3 ,b 4 ,b 5 ,b 6 ⩽q−1
b≠0
i=1
Our task moves to bounding ∣S(b, q)∣. We have the trivial bound
∣S(b, q)∣ ⩽ q 6 ,
but because q is prime and b ≠ 0, we will be able to improve on this bound
substantially.
Our approach will be elementary. We begin with writing
1
′ ′ ′
S(b, q) = 2 ∑
∑ e q (b ⋅ (x, y, z, x , y , z )
q j,k⩽q x , y,z⩽q
x ′ , y ′ ,z ′ ⩽q
(5.2)
+ j(x 2 − y 2 − (x ′ )2 + (y′ )2 )
+ k(y 2 − z 2 − (y′ )2 + (z ′ )2 )).
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
19
As we did when estimating A 0 (q, c 1 , c 2 ), let us first consider the contribution from
those terms when j = 0, k = 0, or j = k. This is
1
∑
q 2 k⩽q
+
+
(5.3)
−
∑
x , y,z⩽q
x ′ , y ′ ,z ′ ⩽q
1
∑
q 2 j⩽q
1
∑
q 2 j⩽q
2
q2
e q (b ⋅ (x, y, z, x ′ , y′ , z ′ ) + k(y 2 − z 2 − (y′ )2 + (z ′ )2 ))
∑
e q (b ⋅ (x, y, z, x ′ , y′ , z ′ ) + j(x 2 − y 2 − (x ′ )2 + (y′ )2 ))
∑
e q (b ⋅ (x, y, z, x ′ , y′ , z ′ ) + j(x 2 − z 2 − (x ′ )2 + (z ′ )2 ))
x , y,z⩽q
x ′ , y ′ ,z ′ ⩽q
x , y,z⩽q
x ′ , y ′ ,z ′ ⩽q
∑
x , y,z⩽q
x ′ , y ′ ,z ′ ⩽q
e q (b ⋅ (x, y, z, x ′ , y′ , z ′ )).
The first three of these terms devolve into the exponential sums involved in the pair
correlations of the fractional parts of αn 2 considered by Heath-Brown in [12]. Indeed,
note that, by completing the square in the variables y, y′ , z, z ′ , we have
(5.4)
1
∑
q 2 k⩽q
∑
x , y,z⩽q
x ′ , y ′ ,z ′ ⩽q
e q (b ⋅ (x, y, z, x ′ , y′ , z ′ ) + k(y 2 − z 2 − (y′ )2 + (z ′ )2 ))
is equal to
1q∣b 1 1q∣b 4 ∑ ∣G(k)∣4 e q (
k⩽q−1
−b 22 + b 32 + b 52 − b 62
),
4k
where
G(k) = ∑ e q (kx 2 )
x⩽q
is the Gauss sum, as before, and 1q∣b abbreviates the indicator function of the condition
q ∣ b. We also use k1 to refer to the multiplicative inverse of k modulo q. Because
∣G(k)∣ = q 1/2 by the standard evaluation (3.10), the term (5.4) is equal to
(5.5)
⎧
q3 − q2
⎪
⎪
⎪
⎪ 2
⎨−q
⎪
⎪
⎪
⎪
⎩0
if q∣b 1 , q∣b 4 , q∣(−b 22 + b 32 + b 52 − b 62 ),
if q∣b 1 , q∣b 4 , q ∤ (−b 22 + b 32 + b 52 − b 62 ),
if q ∤ b 1 or q ∤ b 4 .
We compute that the overall size of (5.3) is
(5.6)
⎧
O(q 3 )
⎪
⎪
⎪
⎪
⎪O(q 3 )
⎪
⎪
⎨
⎪
O(q 3 )
⎪
⎪
⎪
⎪
2
⎪
⎪
⎩O(q )
if q∣b 1 , q∣b 4 , q∣(−b 22 + b 32 + b 52 − b 62 ),
if q∣b 2 , q∣b 5 , q∣(−b 32 + b 12 + b 62 − b 42 ),
if q∣b 3 , q∣b 6 , q∣(−b 12 + b 22 + b 42 − b 52 ),
otherwise.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
20
N. Technau and A. Walker
Now, consider the contribution to (5.2) when j ≠ 0, k ≠ 0, and j ≠ k. By completing
the square again, this contribution is equal to
1
q2
2
2
2
∑ ∣G( j)∣ ∣G(k − j)∣ ∣G(−k)∣
j,k⩽q−1
j≠k
⋅ e q (−
(5.7)
b 12
b 22
b2 b2
b 52
b2
−
− 3 + 4+
+ 6 ).
4 j 4(k − j) 4k 4 j 4(k − j) 4k
This is equal to
q ∑ eq (
j,k⩽q−1
j≠k
f ( j, k)
),
g( j, k)
where
f ( j, k) = j2 (b 32 − b 62 ) + jk(b 12 − b 22 − b 32 − b 42 + b 52 + b 62 ) + k 2 (−b 12 + b 42 )
and
g( j, k) = 4 jk(− j + k).
By letting l = j/k (mod q), we can reparameterize this exponential sum as
q ∑ e q (−
(5.8)
k⩽q−1
2⩽l ⩽q−1
f (l , 1)
).
kg(l , 1)
This, in turn, is equal to
(5.9)
q(q − 1)R − q(q − 2 − R),
where R is the number of ℓ in the range 2 ⩽ ℓ ⩽ q − 1 for which f (ℓ, 1) = 0 (mod q).
There are two relevant cases. If q ∣ (b 12 − b 42 ), q ∣ (b 22 − b 52 ), and q ∣ (b 32 − b 62 ), then R =
q − 2, and so (5.9) is exactly equal to q(q − 1)(q − 2). Otherwise, f (ℓ, 1) is a nonzero
polynomial of degree at most two in ℓ, and so R = O(1). Hence, (5.9) is O(q 2 ).
If we combine our knowledge of the sum (5.8) with our knowledge of the terms
with j = 0, k = 0, j = k from (5.6), this yields:
⎧
⎪
O(q 3 )
⎪
⎪
⎪
⎪
⎪
O(q 3 )
⎪
⎪
⎪
⎪
(5.10) ∣S(b, q)∣ = ⎨O(q 3 )
⎪
⎪
⎪
⎪
O(q 3 )
⎪
⎪
⎪
⎪
2
⎪
⎪
⎩O(q )
if q∣b 1 , q∣b 4 , q∣(−b 22 + b 32 + b 52 − b 62 ),
if q∣b 2 , q∣b 5 , q∣(−b 32 + b 12 + b 62 − b 42 ),
if q∣b 3 , q∣b 6 , q∣(−b 12 + b 22 + b 42 − b 52 ),
if q divides all of (b 12 − b 42 ), (b 22 − b 52 ), and (b 32 − b 62 ),
otherwise.
Plugging this bound into (5.1), we infer that
(5.11)
S 1 − 2S 2 + S 3 ≪ q 3 (T1 + T2 ) + q 2 T3 ,
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
21
where (after a straightforward relabeling of the variables)
T1 =
−q/2<b 1 ,b 2 ,b 3 ,b 4 ,b 5 ,b 6 <q/2
q∣b 1 ±b 4 ,q∣b 2 ±b 5 ,q∣b 3 ±b 6
b≠0
T2 =
∑
−b/2<b 1 ,b 2 ,b 3 ,b 4 <q/2
q∣b 12 −b 22 +b 32 −b 42
b≠0
T3 =
(∏ ∣1̂
[M] (b i )∣) ,
6
∑
i=1
M2 4 ̂
(∏ ∣1[M] (b i )∣) ,
q 2 i=1
(∏ ∣1̂
[M] (b i )∣) .
6
∑
−q/2<b 1 ,b 2 ,b 3 ,b 4 ,b 5 ,b 6 ⩽q/2
b≠0
i=1
Indeed, because q is prime, if q∣b 12 − b 42 , then q∣(b 1 + b 4 ) or q∣(b 1 − b 4 ). This is one
of the places in which the assumption that q is prime is particularly convenient (the
other being the reparameterization of j and k to get (5.8)).
For −q/2 < b < q/2, recall the standard bound
∣1̂
[M] (b)∣ ≪ min (
M 1
, ),
q ∣b∣
which produces the estimate
∑
−q/2<b<q/2
∣1̂
[M] (b)∣ ≪
∑
∣b∣<q/M
M
1
+
≪ log q.
∑
q q/M⩽∣b∣<q/2 ∣b∣
In the sum defining T1 , if (b 1 , b 2 , b 3 ) are fixed, then there are O(1) possibilities for
(b 4 , b 5 , b 6 ). Similarly, in T2 , once (b 1 , b 2 , b 3 ) are fixed, then there are at most two
possibilities for b 4 . Hence,
T1 , T2 ≪
M3
(log q)3
q3
and
T3 ≪ (log q)6 ,
because M < q. Substituting these bounds into (5.11), the proof of Lemma 3.1 is
complete.
∎
6 Proof of Lemma 3.2
Let β and η be as in the statement of Lemma 3.2. We begin the proof with another
auxiliary lemma, which concerns the quantity
∆∗β (q, c 1 , c 2 ) = max2 ∣∆(M, q, c 1 , c 2 )∣.
M⩽q 2+β
Lemma 6.1 Let q be an odd prime. Then, we have
∗
2
o(1)
∑ ∆ β (q, c 1 , c 2 ) ≪ q q 2+β .
7−β
c 1 ,c 2 ⩽q
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
22
N. Technau and A. Walker
Proof We will deduce this result from Lemma 3.1. Taking 1 ⩽ K ⩽ M ⩽ q, we begin
by seeking a bound on
(6.1)
2
∑ (A(M + K, q, c 1 , c 2 ) − A(M, q, c 1 , c 2 )) .
c 1 ,c 2 ⩽q
First, A(M + K, q, c 1 , c 2 ) − A(M, q, c 1 , c 2 ) is at most
∣{x ∈ (M, M + K], y, z ⩽ M + K ∶ x 2 − y 2 ≡ c 1 (mod q), y 2 − z 2 ≡ c 2 (mod q)}∣
+∣{y ∈ (M, M + K], x, z ⩽ M + K ∶ x 2 − y 2 ≡ c 1 (mod q), y 2 − z 2 ≡ c 2 (mod q)}∣
+∣{z ∈ (M, M + K], x, y ⩽ M + K ∶ x 2 − y 2 ≡ c 1 (mod q), y 2 − z 2 ≡ c 2 (mod q)}∣.
Let E 1 (q, c 1 , c 2 ) refer to the first quantity, E 2 (q, c 1 , c 2 ) refer to the second quantity,
and E 3 (q, c 1 , c 2 ) refer to the third quantity. We have that expression (6.1) is at most a
constant times
2
2
2
∑ (E 1 (q, c 1 , c 2 ) + E 2 (q, c 1 , c 2 ) + E 3 (q, c 1 , c 2 ) ).
c 1 ,c 2 ⩽q
By a change of variables, we may reduce consideration just to E 2 . Indeed, if x 2 −
y 2 ≡ c 1 (mod q) and y 2 − z 2 ≡ c 2 (mod q), then z 2 − x 2 ≡ −c 1 − c 2 (mod q). Hence,
E 1 (q, c 1 , c 2 ) = E 2 (q, −c 1 − c 2 , c 1 ), and so
2
2
∑ E 1 (q, c 1 , c 2 ) = ∑ E 2 (q, c 1 , c 2 ) .
c 1 ,c 2 ⩽q
c 1 ,c 2 ⩽q
A similar argument works for E 3 (q, c 1 , c 2 ).
Now, ∑c 1 ,c 2 ⩽q E 2 (q, c 1 , c 2 )2 is equal to
(6.2)
∣{y 1 , y 2 ∈ (M, M + K], x 1 , x 2 , z 1 , z 2 ⩽ M + K ∶
x 12 − y 12 ≡ x 22 − y 22 (mod q),
}∣ .
y 12 − z 12 ≡ y 22 − z 22 (mod q)
Fixing integers k 1 , k 2 and integers y 1 , y 2 ∈ (M, M + K], we will now bound the number of solutions (x 1 , x 2 , z 1 , z 2 ) to the system of equations
x 12 − x 22 = y 12 − y 22 + k 1 q,
(6.3)
z 22 − z 12 = y 22 − y 12 + k 2 q.
If both y 12 − y 22 + k 1 q ≠ 0 and y 22 − y 12 + k 2 q ≠ 0 then, using the divisor bound, we
get q o(1) solutions. If y 12 − y 22 + k 1 q = 0 and y 22 − y 12 + k 2 q ≠ 0, we get O(q o(1) M)
solutions. If both y 12 − y 22 + k 1 q = 0 and y 22 − y 12 + k 2 q = 0, we get O(M 2 ) solutions.
Note that, in such a case, we must have k 1 = −k 2 .
Regarding the other conditions on (y 1 , y 2 , k 1 , k 2 ), we note that the variables k 1 , k 2
are both restricted to intervals of length O(M 2 /q + 1). Furthermore, if k 1 ≠ 0, then the
total number of solutions of y 1 , y 2 , k 1 to y 12 − y 22 = k 1 q, with y 1 , y 2 ∈ (M, M + K], is
O(q o(1) (KM/q + 1)), because ∣y 12 − y 22 ∣ = O(KM). If, however, k 1 = 0, then of course
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
23
y 1 = y 2 , so there are O(K) solutions here. Thus, summing over (y 1 , y 2 , k 1 , k 2 ), we
bound (6.2) above by
⎛
M2
M2
KM
M2
q o(1) K 2 (
+ 1) + M (
+ 1) (
+ 1) + KM (
+ 1)
q
q
q
q
⎝
2
+ M2 (
⎞
KM
+ 1) + KM 2 .
q
⎠
Because K ⩽ M ⩽ q, we may simplify the above and conclude that
(6.4)
2
o(1)
2 4 −2
2
∑ (A(M + K, q, c 1 , c 2 ) − A(M, q, c 1 , c 2 )) ≪ q (K M q + KM ).
c 1 ,c 2 ⩽q
One may think of this upper bound as being given by the sum of an average
contribution and a diagonal contribution.
From (6.4), together with previous lemmas, it turns out that one can control
∑
max ∆(M + H, q, c 1 , c 2 )2 .
1⩽c 1 ,c 2 ⩽q 0⩽H⩽K
Indeed, because
∣(a 1 − a 2 )2 − (b 1 − b 2 )2 ∣ ≪ (a 1 − b 1 )2 + (a 2 − b 2 )2
for all reals a 1 , b 1 , a 2 , b 2 , we have that
∑
(6.5)
max ∆(M + H, q, c 1 , c 2 )2 − ∑ ∆(M, q, c 1 , c 2 )2
c 1 ,c 2 ⩽q 0⩽H⩽K
c 1 ,c 2 ⩽q
is as most a constant times
2
∑ (A(M + K, q, c 1 , c 2 ) − A(M, q, c 1 , c 2 ))
c 1 ,c 2 ⩽q
+ max (
(6.6)
0⩽H⩽K
(M + H)3 − M 3
)
q3
2
2
∑ A 0 (q, c 1 , c 2 ) .
c 1 ,c 2 ⩽q
From expression (6.4) and Lemma 3.3, we conclude that (6.6) is at most
q o(1) (K 2 M 4 q−2 + KM 2 )
again. Finally, using Lemma 3.1 combined with (6.5), we end up with the bound
(6.7)
∑
max ∆(M + H, q, c 1 , c 2 )2 ≪ q o(1) (M 3 + q 2 ) + q o(1) K 2 M 4 q−2 .
c 1 ,c 2 ⩽q 0⩽H⩽K
(The KM 2 term has been absorbed into the M 3 term.)
Before we use (6.7) to prove the lemma, we need the bound
(6.8)
∑
max ∆(H, q, c 1 , c 2 )2 ≪ q o(1) (K 6 q−2 + K 3 ).
c 1 ,c 2 ⩽q 0⩽H⩽K
This may be proved by an identical analysis to the one above, proceeding with M = 0.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
24
N. Technau and A. Walker
Now, returning to the original object of the lemma, we may cover
{n ∈ N ∶ 1 ⩽ n ⩽ q 2+β }
2
by the interval [1, K] together with at most q 2+β K −1 other intervals, each of the form
[M, M + K] with K ⩽ M. Thus,
2
∗
2
o(1)
6 −2
3
∑ ∆ β (q, c 1 , c 2 ) ⩽ q (K q + K )
c 1 ,c 2 ⩽q
∑
+
l ⩽q
⩽q
o(1)
2
2+β
∑
K −1
max
c 1 ,c 2 ⩽q X∈[l K ,(l +1)K)
(K q
6 −2
+ K 3 ) + q o(1)
∆(X, q, c 1 , c 2 )2
∑
2
l ⩽q 2+β K −1
(l 3 K 3 + q 2 + l 4 K 6 q−2 )
⩽ q o(1) (K 6 q−2 + K 3 ) + q o(1) (q 2+β K −1 + q 2+β +2 K −1 + q 2+β −2 K).
8
10
2
Because β ⩽ 1 always, we have q 2+β +2 ⩽ q 2+β , which allows us to reduce matters to the
bound
8
2
q o(1) (K 6 q−2 + K 3 + q 2+β K −1 + q 2+β −2 K).
8
10
Optimizing in K, we find that the minimum value is achieved when K ≍ q 2+β , in which
case the third and fourth terms have the same order of magnitude. We conclude that
1+β
∗
2
o(1)
∑ ∆ β (q, c 1 , c 2 ) ≪ q q 2+β ,
7−β
c 1 ,c 2 ⩽q
and the lemma is proved.
∎
Proof of Lemma 3.2 Let C be a suitably large absolute constant. For all odd prime
q, and integers a such that (a, q) = 1, define
(6.9)
D β,η (a, q) ∶=
∑
r 1 ,r 2 ≠0
2−β
∣r 1 ∣⩽q 2+β
∣r 2 ∣⩽q
∆∗β (q, ar 1 , ar 2 ).
+C η
2−β
+C η
2+β
Now, suppose α ∈ [0, 1] and fractions a/q and N satisfy (3.1) and (3.2). We claim that,
if C is large enough, and if q is large enough in terms of s, t, and η, it follows that
∑
(6.10)
(r 1 ,r 2 )∈S +
r 1 r 2 ≠0
r 1 +r 2 ≠0
∆(N , q, ar 1 , ar 2 ) ⩽ D β,η (a, q).
Indeed, because N ⩽ q 2+β , we have
2
∆(N , q, ar 1 , ar 2 ) ⩽ ∆∗β (q, ar 1 , ar 2 ).
In addition, we have both
2−β
2−β
2+β
2
qL
≪ N 2 −β+11η ≪ q( 2 +11η) 2+β + ≪ q 2+β +C η
N
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
25
and
N 2 q η−1 ≪ N 2−(1−η)(
2+β
2 +10η)
≪ q(
2−β
2
2 +C η) 2+β
2−β
≪ q 2+β +C η .
Referring to the definition (3.6) of S + , we have thus settled the inequality (6.10).
Therefore, to prove Lemma 3.2, it suffices to show that, for almost all α ∈ [0, 1], for
all (a, q, N) satisfying (3.1) and (3.2),
D β,η (a, q)L−2 N −1 ≪α,β,η q−η .
(6.11)
In fact, by (3.2), it suffices to show that, for almost all α ∈ [0, 1] and for all (a, q)
satisfying (3.1),
D β,η (a, q)q
(6.12)
−6+4β
2+β
≪α,β,η q−C η .
This expression makes no mention of N or L, which will be a technical necessity in the
Borel–Cantelli argument to come.
To show (6.12), for each q, we define
Bad β,η (q) = {a ⩽ q ∶ (a, q) = 1, D β,η (a, q)q
−6+4β
2+β
⩾ q−C η }.
It will be enough, then, to show that, for almost all α ∈ [0, 1], only finitely many of the
fractions (a, q) satisfying (3.1) also satisfy a ∈ Bad β,η (q).
To bound the size of Bad β,η (q), we first note that
∣ Bad β,η (q)∣ ⩽ q
−6+4β
2+β +C η
∑ D β,η (a, q).
1⩽a⩽q
(a,q)=1
Furthermore, we define f β,η (q, c 1 , c 2 ) to be the number of triples (a, r 1 , r 2 ) such that
ar 1 ≡ c 1 (mod q),
ar 2 ≡ c 2 (mod q),
for which 1 ⩽ a ⩽ q, (a, q) = 1, r 1 r 2 ≠ 0, and ∣r 1 ∣, ∣r 2 ∣ ⩽ q 2+β +C η . We then have
(6.13)
∣ Bad β,η (q)∣ ⩽ q
−6+4β
2+β +C η
∑
0⩽c 1 ,c 2 ⩽q−1
2−β
f β,η (q, c 1 , c 2 )∆∗β (q, c 1 , c 2 ).
To estimate (6.13), we need the following simple lemma about the size of
f β,η (q, c 1 , c 2 ).
Lemma 6.2 We have
+2C η
2
(q + q 2+β +2C η )q o(1) .
∑ f β,η (q, c 1 , c 2 ) ≪ q 2+β
4−2β
4−2β
c 1 ,c 2 ⩽q
∎
Proof We have that ∑c 1 ,c 2 ⩽q f β,η (q, c 1 , c 2 )2 is equal to the number of pairs of triples
(a, r 1 , r 2 ), (b, s 1 , s 2 ) such that
(6.14)
ar 1 ≡ bs 1 (mod q),
ar 2 ≡ bs 2 (mod q),
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
26
N. Technau and A. Walker
with 1 ⩽ a, b ⩽ q, (a, q) = 1, (b, q) = 1, r 1 r 2 s 1 s 2 ≠ 0 and ∣r 1 ∣, ∣r 2 ∣, ∣s 1 ∣, ∣s 2 ∣ ⩽ q 2+β +C η . By
multiplying the first equation by br 2 and the second equation by br 1 , one sees that, for
every such solution, we must also have
2−β
s 1 r 2 ≡ s 2 r 1 (mod q).
(6.15)
The mapping between the solutions to (6.14) and (6.15) is at most q-to- 1, because if
the variables a, r 1 , r 2 , s 1 , s 2 are fixed, then b is uniquely determined in (6.14).
Solutions to (6.15) are given by solutions (r 1 , r 2 , s 1 , s 2 , k) to
s 1 r 2 − s 2 r 1 = kq,
with k in the range 0 ⩽ ∣k∣ ≪ q 2+β −1+2C η . By the divisor bound, after k, s 1 and r 2 are
fixed there at most q o(1) valid choices of s 2 and r 1 , so the total number of solutions to
(6.15) is at most
2(2−β)
q 2+β +2C η (1 + q 2+β −1+2C η ).
4−2β
4−2β
Multiplying by q to get the number of solutions to (6.14), we prove the lemma.
∎
(The reader may wish to note that if β > 2/3, then the only valid value of k in the
above is k = 0, which simplifies the remainder of the analysis for these cases.)
Returning to (6.13), we have
∣ Bad β,η (q)∣ ⩽ q
−6+4β
2+β +C η
≪q
−6+4β
2+β
1/2
⎛
⎞
⋅ ∑ ∆∗β (q, c 1 , c 2 )2
⎝c 1 ,c 2 ⩽q
⎠
1/2
⋅ (q 2 + 2+β + q 2+β ) ⋅ q 2(2+β) ⋅ q C η+o(1)
1
2−β
4−2β
7−β
≪ (q 2(2+β) + q 2(2+β) )q C η+o(1) .
1+6β
(6.16)
⎛
⎞
⋅ ∑ f η,β (q, c 1 , c 2 )2
⎝c 1 ,c 2 ⩽q
⎠
3+3β
Here, we have used Lemmas 6.1 and 6.2 to go from the first line to the second line.
Now, recall that β < 43 , which implies that
1 + 6β
< 1.
2(2 + β)
The other term is less severe, and, in fact, β < 1 implies
3 + 3β
< 1.
2(2 + β)
Because η is small enough, and C is absolute, we conclude that
(6.17)
∣ Bad(q, β, η)∣ ≪ β,η q 1−2η .
Now, we can finally complete the proof of Lemma 3.2 by using the first Borel–
Cantelli lemma. Indeed, pick α uniformly at random in [0, 1], and for each prime
q ⩾ 3, let E q,β,η be the event that there exists an a with 1 ⩽ a ⩽ q, (a, q) = 1,
1
a
∣α − ∣ < 2−η ,
q
q
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
27
and a ∈ Bad β,η (q). Then,
P(E q,β,η ) ⩽
1 a
1
a
µ (( − 2−η , + 2−η )) ≪ β,η q−1−η .
q q
q q
a∈Bad β,η (q)
∑
Then, ∑q⩾3 P(E q,β,η ) < ∞, and so, with probability 1, only finitely many of the
events E q,β,η occur. Thus, by our long chain of reductions, Lemma 3.2 follows.
7 Concluding remarks
The proof of all of our main theorems is now complete. However, before concluding
the paper, it is certainly worth us discussing whether β < 43 represents a natural limit
of our approach.
The chain of inequalities (6.16) is the critical moment of the entire proof, and this
particular application of Cauchy and Schwarz is the main source of our loss in the
range of β. Suppose that instead we had used the bound
∣ Bad β,η (q)∣ ⩽ q
−6+4β
2+β +C η
⎛
⎞
∑ f β,η (q, c 1 , c 2 )
⎝c 1 ,c 2 ⩽q
⎠
1/2
⎛
⎞
⋅ ∑ f β,η (q, c 1 , c 2 )∆∗β (q, c 1 , c 2 )2
⎝c 1 ,c 2 ⩽q
⎠
(7.1)
1/2
.
For simplicity of exposition here, we will assume that β > 2/3, that C = 0, and we will
ignore all q o(1) terms. It is then easy to see that
1+
∑ f β,η (q, c 1 , c 2 ) ≈ q
2(2−β)
2+β
.
c 1 ,c 2 ⩽q
Combining this bound with Lemma 6.2, one may conclude that f β,η (q, c 1 , c 2 ) ≈ 1 for
q 1+ 2+β pairs (c 1 , c 2 ), and is 0 otherwise. So, given what we know from Lemma 6.1
about the value of ∆∗β (q, c 1 , c 2 )2 averaged over all pairs c 1 and c 2 , it is not utterly
unreasonable to hope that one could prove
2(2−β)
1+
∗
2
∑ f β,η (q, c 1 , c 2 )∆ β (q, c 1 , c 2 ) ≪ β,η q 2+β ⋅ q
7−β
(7.2)
c 1 ,c 2 ⩽q
2(2−β)
2+β
⋅ q−2 = q 2+β ,
9−4β
provided that the weight of ∆∗β (q, c 1 , c 2 )2 does not concentrate on the support of f.
Putting this bound into (7.1), one would then get
∣ Bad β,η (q)∣ ≪ β,η q 2(2+β) ,
3+3β
that is, only the second term from (6.16) would occur. As we have already remarked,
we would then derive
∣ Bad β,η (q)∣ ≪ β,η q 1−2η ,
provided β < 1 and η is small enough. This estimate would expand the range of
Theorem 1.4 all the way to L > N ε . Unfortunately, we have not been able to prove
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
28
N. Technau and A. Walker
a version of Lemma 3.1, which includes the weight f β,η (q, c 1 , c 2 ) in the manner of
expression (7.2).
One also recalls that in our application of Borel and Cantelli, we did not need to
bound ∣ Bad β,η (q)∣ uniformly for all q. One would be satisfied with
∑
q prime
∣ Bad β,η (q)∣
< ∞.
q2
Thoughts move toward expressing the relevant exponential sums as an average of
Kloosterman-type sums over the modulus q, which might be another route for future
research.
Our final remark is that if L → ∞ and N/L → ∞ as N → ∞, then, in the random
model (1.4), the asymptotics are governed by the Central Limit Theorem. One can
derive
Z L,N − L d i st
√
ÐÐ→ N(0, 1),
L
as N → ∞. Theorem 1.4 could then be considered as a first step toward
√ showing that,
for almost all α, the skewness of Wα,L,N satisfies E((Wα,L,N − L)/ L)3 → 0 as N →
∞, with L in a certain range. However, to show this asymptotic, one would need to be
3
able to extract the lower degree main term from EWα,L,N
(which is 3L 2 , as in Section
2), and then subsequently show that the error term in Theorem 2.1 is, in fact, o(L 3/2 ),
rather than merely o(L 3 ).
Appendix A Pair correlations of the dilated squares at
scale N −β
In this section, we briefly indicate how one can deduce the following fact from the
methods of the literature.
Theorem A.1 Let ε ∈ (0, 41 ). Then, for almost all α ∈ [0, 1], for all 1 ⩽ L ⩽ N 1−ε , and for
all (log N)−1 ⩽ s ⩽ log N, we have
R 2 (α, L, N , 1[−s,s] ) = 2Ls(1 + O ε,α (N −ε/13 )).
Note that this result immediately implies the estimate (2.3), by approximating the
function f in (2.3) with a suitable step function.
We have made no attempt to obtain the best possible error term in Theorem A.1,
nor the largest admissible ranges for s and for L. One will observe from the proof that
rather better bounds would certainly follow if one assumed at the outset that L were a
slowly varying function of N.
A version of Theorem A.1 follows from arguments of Aistleitner, Larcher, and
Lewko [2] as well as from arguments of Rudnick [21] and Rudnick and Sarnak [23],
by changing the relevant parameters. If one wanted an explicit characterization of the
set of suitable α in terms of properties of its rational approximations, then one could
also adapt the (much more involved) methods of Heath-Brown [12]. In particular, the
material of the present section is in no way novel. However, we decided to add some
explanations on how to deduce Theorem A.1, partly in order to make the exposition of
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
29
our previous arguments complete and self-contained, and partly in order to describe
explicitly a suitable “sandwiching argument” for this result (expanding upon the
description in [2]).
We begin with the following auxiliary lemma (where again no attempt was made
to obtain the best possible error term).
Lemma A.2 For each m ∈ N, let N m = m 4 . Letting ε ∈ (0, 41 ), for each i in the range
β
−m ε/3 /10 ⩽ i ⩽ m ε/3 , let β m,i = i/m ε/3 and L m,i = N mm,i . Let s m be a real quantity that
−ε/10
ε/10
satisfies m
< sm < m
for large enough m. Then, for almost all α ∈ [0, 1], for all
m, and for all i in the range −m ε/3 /10 ⩽ i ⩽ m ε/3 , we have
L m,i
13 ε ) .
m 15 − 3
Proof of Lemma A.2 Let I = (γ, δ) be an arc on the torus R/Z such that 0 < γ −
δ < 1. Let J ⩾ 1 be an integer. To proceed, we introduce trigonometric polynomials
S ±J (x), of degree J, which approximate the indicator function χ I from above and
below. Selberg, and also Vaaler, constructed such polynomials (cf. Montgomery [19,
pp. 5–6]). Indeed, there exists
R 2 (α, L m,i , N m , 1[−s m ,s m ] ) = 2s m L m,i + O α,ε (
(A.1)
S ±J (x) = ∑ s±J ( j) e( jx),
∣ j∣⩽J
satisfying
S −J (x) ⩽ χ I (x) ⩽ S +J (x)
such that
s±J (0) = δ − γ ±
1
,
J+1
∣s±J ( j)∣ ⩽
(x ∈ R/Z),
1
1
+ min (δ − γ,
)
J+1
π ∣ j∣
(0 < ∣ j∣ ⩽ J) .
For our purposes, given N m and some growth function w(N m ) ⩾ 1, to be specified later, we specify J m,i = ⌊N m w(N m )/L m,i ⌋, γ m,i = −s m L m,i /N m , and δ m,i =
s m L m,i /N m . Note that the definitions of s m and L m,i in the statement of the theorem
imply that these choices yield a valid arc on the torus. Then, defining
R±2 (α, L m,i , N m , 1[−s m ,s m ] ) ∶=
1
Nm
∑
x≠y⩽N m
S ±J m,i (α(x 2 − y 2 )),
we can control R 2 from above and below via
R−2 (α, L m,i , N m , 1[−s m ,s m ] ) ⩽ R 2 (α, L m,i , N m , 1[−s m ,s m ] )
⩽ R+2 (α, L m,i , N m , 1[−s m ,s m ] ).
(A.2)
Let
1
E (L m,i , N m , s m ) ∶= ∫ R±2 (α, L m,i , N m , 1[−s m ,s m ] ) dα
±
0
be the expected value of
(A.3)
R±2 (α, L m,i , N m , 1[−s m ,s m ] ).
It is easy to see that
E (L m,i , N m , s m ) = 2s m L m,i + O(L m,i /w(N m )) + O(s m L m,i /N m ).
±
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
30
N. Technau and A. Walker
Furthermore, by using orthogonality, we also see that the variance
±
2
±
∫ (R 2 (α, L m,i , N m , 1[−s m ,s m ] ) − E (L m,i , N m , s m )) dα
1
0
of
R±2
is at most
⩽
1
2
Nm
∑
∑
x 1 ≠y 1 ⩽N m
0<∣ j 1 ∣,∣ j 2 ∣⩽J m,i
x 2 ≠y 2 ⩽N m j 1 (x 2 −y 2 )+ j 2 (x 2 −y 2 )=0
2
2
1
1
∣s±J m,i ( j 1 )s±J m,i ( j 2 )∣.
(Note that there are no contributions from terms in which j 1 = 0 and j 2 ≠ 0, because
the condition j 2 (x 22 − y 22 ) = 0 cannot be satisfied.)
Because ∣s±J m,i ( j)∣ ⩽ (2s m + 1)L m,i /N m for all j in the range 0 < ∣ j∣ ⩽ J m,i , we can
bound the variance above by O((s m + 1)2 ) times
(A.4)
L 2m,i
4
Nm
RRRR⎧
j 1 (x 12 − y 12 ) = j 2 (x 22 − y 22 ),
⎪
⎪
RRR⎪
⎪
1 ⩽ xk ≠ yk ⩽ Nm ,
⎪
RRR⎪
( j 1 , x 1 , y 1 , j 2 , x 2 , y 2 ) ∈ Z6 ∶
m)
RRR⎨
0 < ∣ j k ∣ ⩽ N m Lw(N
,
⎪
m,i
⎪
RRRR⎪
⎪
⎪
k
=
1,
2
RR⎪
⎩
⎫
RRR
⎪
⎪
RR
⎪
⎪
⎪
⎪RRRR
⎬RR .
RRR
⎪
⎪
⎪
RRR
⎪
⎪
⎪
⎭RR
Let us fix the first three variables above, that is, j 1 , x 1 , y 1 . Then, by writing x 22 −
y 22 = d 1 d 2 , with d 1 = x 2 − y 2 and d 2 = x 2 + y 2 , we deduce that d k ∣ j 1 (x 12 − y 12 ) for
o(1)
k = 1, 2. By the divisor bound, there are at most N m many possibilities for d 1 , d 2
O(1)
(provided that w(N m ) ⩽ N m ). Moreover, any choice of d 1 , d 2 uniquely determines
the variables x 1 , y 1 via x 2 = (d 1 + d 2 )/2 and y 2 = (d 2 − d 1 )/2. Furthermore, we note
o(1)
that j 2 is determined up to ≪ N m many choices. The upshot is that given one of
3
the O(N m w(N m )/L m,i ) many admissible choices for j 1 , x 1 , y 1 , the second block of
o(1)
variables j 2 , x 2 , y 2 is determined up to O(N m ) many possibilities. Therefore, the
−1+o(1)
variance of R±2 is at most O((s m + 1)2 L m,i w(N m )N m
).
±
For the ease of exposition, we let κm,i = ∣R±2 (α, L m,i , N m , 1[−s m ,s m ] ) −
E ± (L m,i , N m , s m )∣. We infer, by Chebychev’s inequality, that
P (α ∈ [0, 1] ∶ κ±m,i ⩾
L m,i
(s m + 1)2 w(N m )3 N m
)⩽
w(N m )
N m L m,i
o(1)
⩽
(s m + 1)2 w(N m )3 N m
Choose the growth function w(N m ) to be
9/10
⎛ Nm
⎞
w(N m ) =
1+ε
⎝m ⎠
1/3
.
Therefore, because s m < m ε/10 , we conclude that
P (α ∈ [0, 1] ∶ κ±m,i ⩾
1
L m,i
) ≪ε 1+ ε .
w(N m )
m 2
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
o(1)
9/10
Nm
.
On the triple correlations of fractional parts of n 2 α
31
Summing over i in the range −m ε/3 /10 ⩽ i ⩽ m ε/3 , with the union bound, we get
P (α ∈ [0, 1] ∶ ∃ i ∈ [−m ε/3 /10, m ε/3 ] s.t. κ±m,i ⩾
L m,i
1
) ≪ε 1+ ε .
w(N m )
m 6
Recalling our estimate (A.3), the first Borel–Cantelli lemma implies that, for
almost every α ∈ [0, 1], the following relation holds for all m ⩾ 1 and all admissible
i ∈ [−m ε/3 /10, m ε/3 ]:
1
Nm
∑
x≠y⩽N m
R±2 (α, L m,i , N m , 1[−s m ,s m ] ) = 2s m L m,i + O α,ε (
sm
L m,i
) + Oα (
).
w(N m )
Nm
From (A.2), and substituting in the explicit growth function w(N m ), the lemma
follows.
∎
Proof of Theorem A.1 We begin with the trivial observation that, by combining s
and L into a single parameter, it is enough to show that, for almost all α ∈ [0, 1], for all
N, and for all L in the range N −1/11 ⩽ L ⩽ N 1−ε/2 ,
(A.5)
R 2 (α, L, N , 1[−1,1] ) = 2L(1 + O α,ε (N − 13 )).
ε
Now, for each N, choose m such that N m ≤ N < N m+1 , where N m = m 4 as in Lemma
A.2. We put θ m = N m+1 /N m . Then, for any L,
N m+1
R 2 (α, L, N m+1 , 1 N m+1 [−1,1] )
N
N
⩽ θ m R 2 (α, L, N m+1 , 1θ m [−1,1] ),
R 2 (α, L, N , 1[−1,1] ) ⩽
and similarly,
R 2 (α, L, N , 1[−1,1] ) ⩾
Nm
R 2 (α, L, N m , 1 N m [−1,1] ) ⩾ θ −1
).
m R 2 (α, L, N m , 1 θ −1
m [−1,1]
N
N
For each N −1/11 ⩽ L ⩽ N 1−ε/2 , there exists an i in the range −m ε/3 /10 ⩽ i ⩽ m ε/3
such that L m,i ⩽ L ⩽ L m,i+1 , where L m,i is as in Lemma A.2. Now, we record that the
upper and lower bounds above satisfy
R 2 (α, L, N m , 1θ −1
) ⩾ R 2 (α, L m,i , N m , 1θ −1
),
m [−1,1]
m [−1,1]
R 2 (α, L, N m+1 , 1θ m [−1,1] ) ⩽ R 2 (α, L m,i+1 , N m+1 , 1θ m [−1,1] ).
Moreover, Lemma A.2 implies that there is a set Ω ε ⊂ [0, 1], with full measure, such
that, if α ∈ Ω ε and ε ∈ (0, 41 ], then for all m ⩾ 1 and i in the range −m ε/3 /10 ⩽ i ⩽ m ε/3 ,
R 2 (α, L m,i , N m , 1θ m [−1,1] ) = 2θ m L m,i (1 + O α,ε (m− 15 + 3 )).
13
ε
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
32
N. Technau and A. Walker
By using that L m,i+1 /L m,i = 1 + O ε (N −ε/13 ) and also that θ m = 1 + O(m−1 ), we
infer that
R 2 (α, L, N , 1[−1,1] ) ⩽ 2L(1 + O α,ε (N − 4 ))(1 + O α,ε (N − 13 ))(1 + O α,ε (m− 15 + 3 ))
ε
1
13
ε
⩽ 2L(1 + O α,ε (N − 13 )).
ε
Similarly, we conclude that
R 2 (α, L, N , 1[−1,1] ) ⩾ 2L(1 + O α,ε (N − 13 )).
ε
Combining these two estimates shows (A.5), thus completing the proof of Theorem A.1.
∎
Appendix B Discrepancy and k-point correlation functions at
scale N −β
The purpose of the present section is to record a few simple observations concerning
the relationship between discrepancy and k-point correlation functions.
Definition B.1 Let (x n )∞
n=1 be a sequence of points in [0, 1). Let k ⩾ 2 be a natural
number, and let g ∶ R k−1 Ð→ [0, 1] be a compactly supported function. Then, the kth
correlation function R k ((x n )∞
n=1 , L, N , g) is defined to be
R k ((x n )∞
n=1 , L, N , g) ∶=
1
N
∑
n 1 ,...,n k ⩽N
distinct
g(
N
N
N
y 1 , y 2 , . . . , y k−1 ) ,
L
L
L
where y i = {x n i − x n i+1 }sgn , for i = 1, . . . , k − 1, with
{⋅}sgn ∶ R Ð→ (−1/2, 1/2]
denoting the signed distance to the nearest integer.
The main point we are conveying here is that, as expected, the correlations are controlled on the scales in which the discrepancy allows us to count points asymptotically.
Lemma B.1 (a) Let D N denote the discrepancy of the sequence (x n )∞
n=1 in [0, 1), and
suppose sup{g > 0 ∶ D N ≪ g N −g for all N} = γ > 0. Let k ⩾ 2 be a natural number, and
let Y be a uniformly distributed random variable modulo 1. Let ε > 0 be suitably small
in terms of γ, and for all N ∈ N and L ∈ R satisfying L ⩽ N, let
W((x n )∞
n=1 , L, N) ∶= ∣{n ⩽ N ∶ x n ∈ [Y , Y + L/N] mod 1}∣.
Then, if L is in the range N 1−γ+ε < L ⩽ N, we have
(B.1)
k
k
−ε/2
EW((x n )∞
)).
n=1 , L, N) = L (1 + O ε (kN
Furthermore, for all continuous functions g ∶ R k−1 Ð→ [0, 1] and for all L in the range
N 1−γ+ε < L < N 1−ε ,
(B.2)
k−1
R k ((x n )∞
∫ g(w) dw,
n=1 , L, N , g) = (1 + o g,ε,k (1))L
as N → ∞, where the error term is independent of the choice of parameters L.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
33
(b) If (a n )∞
n=1 is a strictly increasing sequence of positive integers, for almost every
α ∈ [0, 1], for all N ∈ N and for all L ∈ R in the range N 2 +ε < L ⩽ N, the sequence
1
satisfies estimates (B.1) and (B.2).
(αa n mod 1)∞
n=1
Remark: Part (b) of the lemma proves our earlier assertion (1.5).
Proof Fix a small ε > 0 throughout this proof. For part (a), the proof of (B.1) is trivial.
Indeed, by the discrepancy estimate, we have
∣{n ⩽ N ∶ x n ∈ [Y , Y + L/N] mod 1}∣ = L + O(N D N )
= L + O ε (N 1−γ+ε/2 ) = L(1 + O ε (N −ε/2 )).
Raising to the kth power and averaging over Y, we obtain (B.1).
To prove the correlation estimate (B.2), by approximating the function g by step
functions, we see that it is enough to prove it in the case when g is the indicator
function of a box
[s 1 , t 1 ] × ⋅ ⋅ ⋅ × [s k−1 , t k−1 ].
We may assume, without loss of generality, that N is large enough, so that (t i −
s i )L/N ⩽ 1 for all i ⩽ k − 1. (This is why it is important for the correlation estimate
to preclude the case L = N.) Then, fixing n k , we see that R k ((x n )∞
n=1 , L, N , g) counts
the number of x n k−1 ≠ x n k such that x n k−1 ∈ [s k−1 L/N + x n k , t k−1 L/N + x n k ] mod 1,
times the number of x n k−2 ≠ x n k−1 , x n k such that x n k−2 ∈ [s k−2 L/N + x n k−1 , t k−2 L/N +
x n k−1 ] mod 1, and so forth.
By the discrepancy estimate, the total number of choices is
((t k−1 − s k−1 )L + O ε (LN −ε/2 )) × ((t k−2 − s k−2 )L + O ε (LN −ε/2 ))
× ⋅ ⋅ ⋅ × ((t 1 − s 1 )L + O ε (LN −ε/2 )).
Summing over all n k and then normalizing by 1/N, we have
k−1
k−1
(1 + O ε, g (kN −ε/2 )),
R k ((x n )∞
n=1 , L, N , g) = (∏(t i − s i )) L
i=1
as desired.
This proves part (a) of the assertion. The remaining part follows by recalling that a
classical (and far more general) result of Erdős and Koksma [9, Theorem 2] furnishes
−1/2+ε
an upper bound on the discrepancy of (αa n mod 1)∞
, for each
n=1 of the quality N
fixed ε > 0 and for almost every α ∈ [0, 1].
∎
Steinerberger [27] raised the question of whether “most sequences”6 have uniform
pair correlations at some scale 0 < β < 1. The above part (b) answers Steinerberger’s
question in a strong sense. Furthermore, it seems worthwhile to record the following
6
The precise meaning of the word “most” was left open for interpretation by Steinerberger, and was
already put in quotation marks in the original paper.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
34
N. Technau and A. Walker
consequence. To state it, recall that for any irrational α ∈ [0, 1], there is a unique
sequence of positive integers α n ≥ 1 such that
1
α = lim
,
1
n→∞
α1 +
1
α2 +
⋱+
1
αn
where α n is called the nth partial quotient of α. Writing the fraction
1
=∶
1
α1 +
α2 +
pn
qn
1
⋱+
1
αn
in lowest terms, we call q n a convergent denominator of α.
Corollary B.2 Let α n be the nth partial quotient of the irrational number α ∈ [0, 1].
Given N ⩾ 1, let i(N) be such that the convergent denominator q i(N) of α satisfies
q i(N) ⩽ N < q i(N)+1 . If, for each ε > 0, we have
A N (α) = ∑ α j ≪ N ε ,
j⩽i(N)
(αn)∞
n=1
then the Kronecker sequence
satisfies (B.2), for any k ⩾ 2 and for any scale L
such that N δ < L < N for any fixed δ ∈ (0, 1) (the o(1) term in (B.2) then also depends
on δ).
Proof As D N ((αn mod 1)∞
n=1 ) ≪ ε A N (α) (cf. [15, equation (3.18)]), Lemma B.1
completes the proof.
∎
It is well known (also with a higher-dimensional generalization due to Beck [3])
that, for almost every α ∈ [0, 1], the discrepancy of the Kronecker sequence is ≪
(log N)1+ε , for each ε > 0. Thus, the above corollary generalizes and sharpens a result
of Weiı and Skill [28], stating that (ϕn)∞
Poissonian pair correlations on each
n=1 has √
. Furthermore, the condition on
scale β < 1, where ϕ denotes the Golden ratio 5+1
2
A N is known to be true for algebraic α, due to Roth’s famous approximation theorem.
Finally, we remark that, if one so wished, one could readily replace the N ε -terms
in Corollary B.2 by appropriate powers of logarithms.
Appendix C The Rudnick–Sarnak obstruction
In this final appendix, we detail an obstruction to studying the higher-order correlation function R k (α, L, N , g). This obstruction is a generalization of a fundamental
observation of Rudnick and Sarnak [23, Section 4], which those authors made in the
context of constant L and for the triple correlation function of (αn 2 mod 1)∞
n=1 . We
address the more general situation of sequences of the shape (αn d mod 1)∞
n=1 , where
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
35
d ⩾ 2 is a fixed integer. We are also interested in identifying the full range of L in which
the obstruction persists.
To this end, for a compactly supported smooth test function g ∶ R k−1 → R, we
define the correlation function R dk (α, L, N , g) by
R dk (α, L, N , g) ∶=
1
N
∑
1⩽x 1 ,...,x k ⩽N
distinct
g(
N
N
d
− x kd )}sgn ) ,
{α(x 1d − x 2d )}sgn , . . . , {α(x k−1
L
L
where
{⋅}sgn ∶ R Ð→ (−1/2, 1/2]
denotes the signed distance to the nearest integer. Rudnick and Sarnak’s approach to
pair correlations, like in Appendix A, involves showing that
1
d
∫ (R 2 (α, L, N , g) − L
0
2
N −1
̂
g (0)) dα = o g (L 2 ),
N
as N → ∞. Therefore, in order to make a similar approach work for the higher k-point
correlation functions, one would need
1
d
k−1
∫ (R k (α, L, N , g) − L
(C.1)
0
(N) k
̂
g (0)) dα = o d ,k, g (L 2(k−1) ),
Nk
2
as N → ∞, where 0 is the zero vector in R k−1 and (N) k = N(N − 1) . . . (N − k + 1)
abbreviates the kth falling factorial. Our purpose here is to show that, for certain ranges
of d, k, and L, equation (C.1) cannot hold.
Indeed, by applying the Poisson summation formula, one may expand
R dk (α, L, N , g) into a Fourier series
d
(L, N , g)e(ℓα),
R dk (α, L, N , g) = ∑ c k,ℓ
ℓ∈Z
d
d
with certain Fourier coefficients c k,ℓ
(L, N , g). One may compute c k,ℓ
(L, N , g) explick−1
d
itly. For a given vector a ∈ Z , let S a,k,ℓ (N) denote the set of integer vectors x ∈ N k ,
with distinct components 1 ⩽ x i ⩽ N, satisfying the Diophantine equation
d
− x kd ) = ℓ.
a 1 (x 1d − x 2d ) + ⋯ + a k−1 (x k−1
d
One readily verifies that c k,ℓ
(L, N , g) is of the special form
d
(L, N , g) =
c k,ℓ
(C.2)
L k−1
L
d
g ( a) .
∑ ∣S a,k,ℓ (N)∣ ̂
k
N a∈Zk−1
N
From Parseval, we conclude that
∫
1
0
(R dk (α, L, N , g) − L k−1
(N) k
d
̂
(L, N , g)∣2 .
g (0)) dα = ∑ ∣c k,ℓ
Nk
ℓ≠0
2
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
36
N. Technau and A. Walker
d
Now, let ρ = N d+1+ε /L. We observe that if ∣ℓ∣ ⩾ ρ and S a,k,ℓ
(N) ≠ 0, then ∥a∥∞ ⩾
N /L(k − 1). Therefore, from the rapid decay of ̂
g and the formula (C.2), we
d
conclude that ∣c k,ℓ
(L, N , g)∣ = O ε, g,k,K (N −K ) for such ℓ. Hence,
1+ε
d
2
−K
d
2
∑ ∣c k,ℓ (L, N , g)∣ = ∑ ∣c k,ℓ (L, N , g)∣ + O ε, g,k,K (N ).
ℓ≠0
0<∣ℓ∣⩽ρ
To estimate the right-hand side, we note that, by the Cauchy–Schwarz inequality,
2
d
d
(L, N , g)∣ .
(L, N , g)∣2 ⩾ ∣ ∑ c k,ℓ
ρ ∑ ∣c k,ℓ
0<∣ℓ∣⩽ρ
0<∣ℓ∣⩽ρ
The sum inside the absolute value on the right-hand side equals, up to a term of size
O ε, g,k,K (N −K ), the quantity
∑ c k,ℓ (L, N , g)e(0) − L
d
ℓ∈Z
k−1 (N) k
Nk
̂
g (0),
which is
R dk (0, L, N , g) + O g (L k−1 ).
Assuming that g(0) ≠ 0 and L/N = o(1) as N → ∞, this is equal to
N k−1 (1 + o(1))g(0),
as N → ∞. Now, by combining these considerations, we conclude that
1
d
k−1
∫ (R k (α, L, N , g) − L
0
≫ε, g,k
(N) k
̂
g (0)) dα
Nk
2
1 2(k−1)
N
∣g(0)∣2 = LN 2k−d−3−ε ∣g(0)∣2 .
ρ
If (C.1) is to hold for all functions g, for each ε > 0, we must have
L ≫ g,ε,k N
2k−d−3−ε
2k−3
= N 1− 2k−3 .
d+ε
In particular, for k = 3 and d = 2, the convergence (C.1) fails unless L ≫ N 6−3 =
1−ε
N 3 . This justifies the statements we made in the introduction to the effect that
the Rudnick–Sarnak obstruction for triple correlations extends throughout the range
L < N 1/3 .
We also note, however, that as soon as
6−5−ε
d > 2k − 3,
there is no such obstruction (for L constant in terms of N).
The failure of (C.1) is an artefact of the integrand having a large spike when α ≈ 0
(and more generally, the integrand has large spikes when α is very well approximated
by rationals with small denominators). One wonders whether L 2 -convergence (C.1)
can be recovered by restricting to the “minor arcs,” but this also appears to be a difficult
problem.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
On the triple correlations of fractional parts of n 2 α
37
Acknowledgment We would like to thank Zeev Rudnick and Christoph Aistleitner
for helpful comments relating to previous versions of the manuscript, and Andrew
Granville and Dimitris Koukoulopoulos for many interesting conversations. We
would also like to thank several anonymous referees for their suggestions and corrections.
References
[1] C. Aistleitner and G. Larcher, Metric results on the discrepancy of sequences (a n α)n≥1 modulo one
for integer sequences (a n )n≥1 of polynomial growth. Mathematika 62(2016), no. 2, 478–491.
[2] C. Aistleitner, G. Larcher, and M. Lewko, Additive energy and the Hausdorff dimension of the
exceptional set in metric pair correlation problems. Israel J. Math. 222(2017), no. 1, 463–485.
[3] J. Beck, Probabilistic Diophantine approximation, I. Kronecker sequences. Ann. of Math. (2)
140(1994), no. 2, 449–502.
[4] M. V. Berry and M. Tabor, Level clustering in the regular spectrum. Proc. Roy. Soc. Lond. A Math.
Phy. Sci. 356(1977), no. 1686, 375–394.
[5] F. Boca and A. Zaharescu, Pair correlation of values of rational functions (mod p). Duke Math. J.
105(2000), no. 2, 267–307.
[6] K.-L. Chung, An estimate concerning the Kolmogoroff limit distribution. Trans. Amer. Math. Soc.
67(1949), no. 1, 36–50.
[7] D. √
El-Baz, J. Marklof, and I. Vinogradov, The two-point correlation function of the fractional parts
of n is Poisson. Proc. Amer. Math. √
Soc. 143(2015), no. 7, 2815–2828.
[8] N. Elkies and C. McMullen, Gaps in n mod 1 and ergodic theory. Duke Math. J. 123(2004), no. 1,
95–139.
[9] P. Erdős and J. Koksma, On the uniform distribution modulo 1 of sequences (f(n, ϑ)). Nederl.
Akad. Wetensch. Proc. 52(1949), 851–854.
[10] A. Eskin, G. Margulis, and S. Mozes, Quadratic forms of signature (2, 2) and eigenvalue spacings
on rectangular 2-tori. Ann. of Math. (2) 161(2005), no. 2, 679–725.
[11] G. Harman, Metric number theory. London Mathematical Society Monographs New Series, 18,
Clarendon Press, Oxford, UK, 1998.
[12] D. R. Heath-Brown, Pair correlation for fractional parts of αn2 . Math. Proc. Cambridge Philos.
Soc. 148(2010), no. 3, 385–407.
[13] H. Iwaniec and E. Kowalski, Analytic number theory. Vol. 53, American Mathematical Society,
Providence, RI, 2004.
[14] A. Khintchine, Über einen Satz der Wahrscheinlichkeitsrechnung. Fund. Math. 6(1924), no. 1,
9–20.
[15] L. Kuipers and H. Niederreiter, Uniform distribution of sequences. Courier Corporation, North
Chelmsford, MA, 2012.
[16] C. Lutsko, Long-range correlations of sequence modulo 1. Preprint, 2020. arXiv:2007.09292
[17] J. Marklof, The Berry–Tabor conjecture. In: C. Casacuberta, R. M. Miró-Roig, J. Verdera, and S.
Xambó-Descamps (eds.), European Congress of Mathematics, Springer, New York, 2001,
pp. 421–427.
[18] J. Marklof, Pair correlation densities of inhomogeneous quadratic forms. Ann. of Math. (2)
158(2003), no. 2, 419–471.
[19] H. Montgomery, Ten lectures on the interface between analytic number theory and harmonic
analysis. CBMS Regional Conference Series in Mathematics, 84, American Mathematical Society,
Providence, RI, 1994.
[20] Z. Rudnick, Quantum chaos? Notices Amer. Math. Soc. 55(2008), no. 1, 32–34.
[21] Z. Rudnick, A metric theory of minimal gaps. Mathematika 64(2018), no. 3, 628–636.
[22] Z. Rudnick and P. Kurlberg, The distribution of spacings between quadratic residues. Duke J.
Math. 100(1999), 211–242.
[23] Z. Rudnick and P. Sarnak, The pair correlation function of fractional parts of polynomials. Comm.
Math. Phys. 194(1998), no. 1, 61–70.
[24] Z. Rudnick, P. Sarnak, and A. Zaharescu, The distribution of spacings between the fractional parts
of n2 α. Invent. Math. 145(2001), no. 1, 37–57.
[25] Z. Rudnick and A. Zaharescu, The distribution of spacings between fractional parts of lacunary
sequences. Forum Math. 14(2002), no. 5, 691–712.
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
38
N. Technau and A. Walker
[26] P. Sarnak, Values at integers of binary quadratic forms. In: Harmonic analysis and number theory
(Montreal, PQ, 1996), CMS Conference Proceedings, 21, American Mathematical Society,
Providence, RI, 1997, 181–203.
[27] S. Steinerberger, Poissonian pair correlation and discrepancy. Indag. Math. 29(2018), no. 5,
1167–1178.
[28] C. Weiı and T. Skill, Sequences with almost Poissonian pair correlations. Preprint, 2021.
arXiv:1905.02760
[29] H. Weyl, Über die gleichverteilung von zahlen mod eins. Math. Ann. 77(1916), no. 3, 313–352.
[30] A. Zaharescu, Correlation of fractional parts of n2 α. Forum Math. 15(2003), no. 1, 1–21.
University of Wisconsin-Madison, Madison, USA
e-mail: technau@wisc.edu
Trinity College, Cambridge, UK
e-mail: aw530@cam.ac.uk
Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.
Download