Canad. J. Math. 2021, pp. 1–38
http://dx.doi.org/10.4153/S0008414X21000249
© Canadian Mathematical Society 2021

On the triple correlations of fractional parts of $n^2\alpha$

Niclas Technau and Aled Walker

Abstract. For fixed $\alpha \in [0,1]$, consider the set $S_{\alpha,N}$ of dilated squares $\alpha, 4\alpha, 9\alpha, \dots, N^2\alpha$ modulo 1. Rudnick and Sarnak conjectured that, for Lebesgue almost all such $\alpha$, the gap-distribution of $S_{\alpha,N}$ is consistent with the Poisson model (in the limit as $N$ tends to infinity). In this paper, we prove a new estimate for the triple correlations associated with this problem, establishing an asymptotic expression for the third moment of the number of elements of $S_{\alpha,N}$ in a random interval of length $L/N$, provided that $L > N^{1/4+\varepsilon}$. The threshold of $\frac{1}{4}$ is substantially smaller than the threshold of $\frac{1}{2}$ (which is the threshold that would be given by a naïve discrepancy estimate). Unlike the theory of pair correlations, rather little is known about triple correlations of the dilations $(\alpha a_n \bmod 1)_{n=1}^{\infty}$ for a nonlacunary sequence $(a_n)_{n=1}^{\infty}$ of increasing integers. This is partially due to the fact that the second moment of the triple correlation function is difficult to control, and thus standard techniques involving variance bounds are not applicable. We circumvent this impasse by using an argument inspired by works of Rudnick, Sarnak, and Zaharescu, and Heath-Brown, which connects the triple correlation function to some modular counting problems. In Appendix B, we comment on the relationship between discrepancy and correlation functions, answering a question of Steinerberger.

1 Introduction

Let $(a_n)_{n=1}^{\infty}$ be a strictly increasing sequence of positive integers. This paper is concerned with the distribution of $(\alpha a_n \bmod 1)_{n=1}^{\infty}$ in short intervals, for a generic dilate $\alpha \in [0,1]$, with particular focus on the case when $a_n = n^2$.
Received by the editors July 21, 2020; revised April 14, 2021; accepted April 19, 2021. Published online on Cambridge Core May 4, 2021. While the work toward this paper was being carried out, N.T. was supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No. 786758) and by the Austrian Science Fund (FWF): project J-4464. A.W. is supported by a Junior Research Fellowship from Trinity College Cambridge, and was also supported by a Post-Doctoral Fellowship at the Centre de Recherches Mathématiques. AMS subject classification: 11K06, 11K60, 11L05. Keywords: Triple correlations, uniform distribution, discrepancy. Downloaded from https://www.cambridge.org/core. 04 Jan 2022 at 01:58:45, subject to the Cambridge Core terms of use.

We begin with the familiar notion of the discrepancy $D_N = D_N((\alpha a_n \bmod 1)_{n=1}^{\infty})$, defined to be
\[
D_N = \sup_{0<a<b<1} \left| \frac{|\{n \le N : \alpha a_n \bmod 1 \in (a,b)\}|}{N} - (b-a) \right|. \tag{1.1}
\]
It is an old result of Weyl, contained in his 1916 paper [29], that $D_N((\alpha n^2 \bmod 1)_{n=1}^{\infty}) = o_{\alpha}(1)$ as $N \to \infty$ for any irrational $\alpha$. The sequence $(\alpha n^2 \bmod 1)_{n=1}^{\infty}$ is then said to be equidistributed modulo 1. One way of viewing Weyl's result is as demonstrating a pseudorandomness property for the sequence $(\alpha n^2 \bmod 1)_{n=1}^{\infty}$. Indeed, in the random model in which $(\alpha n^2 \bmod 1)_{n=1}^{N}$ is replaced by $N$ independent random variables $X_1, \dots, X_N$, which are uniformly distributed on $[0,1)$, one has (see footnote 1)
\[
\limsup_{N \to \infty} \frac{\sqrt{2N}\, D_N((X_n)_{n=1}^{\infty})}{\sqrt{\log\log N}} = 1
\]
almost surely. In particular, $\mathbb{E} D_N((X_n)_{n=1}^{\infty}) = o(1)$ as $N \to \infty$. One might wonder, considering the strong quantitative decay enjoyed in the random model, whether Weyl's result admits such a quantitative refinement. Unfortunately, as is well known, if $\alpha$ is well approximated by rational numbers with small denominators, then the discrepancy $D_N((\alpha n^2 \bmod 1)_{n=1}^{\infty})$ can tend to zero extremely slowly.
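For readers who wish to experiment with the quantity in (1.1), the following Python sketch computes $D_N$ for a finite point set. It is an illustration only and plays no role in the argument. The function names are hypothetical, the choice $\alpha = \sqrt{2}$ is arbitrary, and floating-point arithmetic restricts it to moderate $N$. The closed formula used in `discrepancy` is the standard identity for sorted points in $[0,1)$; it is assumed here rather than taken from the paper, and it is cross-checked against a direct search over intervals.

```python
import math


def dilated_squares(alpha, N):
    """Return the points alpha * n^2 mod 1 for n = 1, ..., N (floating point)."""
    return [(alpha * n * n) % 1.0 for n in range(1, N + 1)]


def discrepancy(points):
    """Discrepancy (1.1) via the identity D_N = 1/N + max(u) - min(u),
    where u_i = i/N - x_(i) and x_(1) <= ... <= x_(N) are the sorted points.
    (Assumed standard identity, not quoted from the paper.)"""
    xs = sorted(points)
    N = len(xs)
    u = [(i + 1) / N - x for i, x in enumerate(xs)]
    return 1.0 / N + max(u) - min(u)


def discrepancy_bruteforce(points):
    """Direct O(N^2) search over the extremal intervals, for cross-checking."""
    xs = sorted(points)
    N = len(xs)
    best = 0.0
    # Over-counting: intervals that barely enclose xs[i], ..., xs[j].
    for i in range(N):
        for j in range(i, N):
            best = max(best, (j - i + 1) / N - (xs[j] - xs[i]))
    # Under-counting: open intervals whose endpoints are points (or 0 or 1).
    walls = [0.0] + xs + [1.0]
    for i in range(N + 2):
        for j in range(i + 1, N + 2):
            best = max(best, (walls[j] - walls[i]) - (j - i - 1) / N)
    return best


if __name__ == "__main__":
    alpha = math.sqrt(2)  # arbitrary illustrative choice of dilate
    for N in (100, 1000, 10000):
        print(N, discrepancy(dilated_squares(alpha, N)))
```

The brute-force routine exists only to validate the fast formula on small inputs; for larger $N$ one would call `discrepancy` alone.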
Footnote 1. This is by the law of the iterated logarithm. See Khintchine [14], as well as Chung [6, Theorem 2*] and Kuipers and Niederreiter [15, p. 98].

However, for a generic $\alpha$, the situation is much improved, and in fact, for any strictly increasing sequence of positive integers $(a_n)_{n=1}^{\infty}$, one has the classical result of Erdős and Koksma [9], which implies that
\[
D_N((\alpha a_n \bmod 1)_{n=1}^{\infty}) = O_{\varepsilon}(N^{-\frac{1}{2}+\varepsilon})
\]
for Lebesgue almost all $\alpha \in [0,1]$. This trivially implies that, for almost all $\alpha \in [0,1]$, if $I \subset \mathbb{R}/\mathbb{Z}$ is a fixed interval of length $|I| \ge N^{-\frac{1}{2}+\varepsilon}$, then
\[
|\{n \le N : \alpha n^2 \bmod 1 \in I\}| = (1 + o_{\varepsilon}(1)) N |I|. \tag{1.2}
\]
So, at least for a generic $\alpha$, pseudorandomness is enjoyed down to the scale $N^{-1/2+\varepsilon}$. A result of Aistleitner and Larcher [1, Corollary 1] implies that the exponent $\frac{1}{2}$ is optimal for the metric discrepancy problem: for generic $\alpha$ and for each $\varepsilon > 0$, there are infinitely many $N$ such that $D_N((\alpha a_n \bmod 1)_{n=1}^{\infty}) > N^{-1/2-\varepsilon}$, provided $a_n = P(n)$ for some polynomial $P$, of degree at least two, with integer coefficients.

By allowing the interval $I$ to vary, one can develop a related notion of pseudorandomness at scales that are smaller than $N^{-1/2}$, one which concerns the "clustering" of the points $\alpha n^2 \bmod 1$. To introduce this notion, which will be the main focus of the paper, we let $Y$ be a random variable that is uniformly distributed on $[0,1)$, and for a natural number $N$ and a parameter $L$ in the range $0 < L \le N$, we let $W_{\alpha,L,N}$ be the random variable
\[
W_{\alpha,L,N} := |\{n \le N : \alpha n^2 \in [Y, Y + L/N] \bmod 1\}|. \tag{1.3}
\]
It is easy to see that $\mathbb{E} W_{\alpha,L,N} = L$. But how should one expect $W_{\alpha,L,N}$ to be distributed in the limit $N \to \infty$ (for a generic dilate $\alpha$)? Consider the same random model as before, in which $(\alpha n^2 \bmod 1)_{n=1}^{N}$ is replaced by $N$ independent random variables $X_1, \dots, X_N$, which are uniformly distributed on $[0,1)$. Then, if $L$ is constant as $N \to \infty$, letting
\[
Z_{L,N} := |\{n \le N : X_n \in [Y, Y + L/N] \bmod 1\}|, \tag{1.4}
\]
one may calculate that
\[
Z_{L,N} \xrightarrow{\ \mathrm{dist}\ } \mathrm{Po}(L)
\]
as $N \to \infty$, where $\mathrm{Po}(L)$ is a Poisson-distributed random variable with parameter $L$. Having described this random model, we can now state the following remarkable conjecture.

Conjecture 1.1 (Rudnick and Sarnak [23]; see footnote 2) For almost all $\alpha \in [0,1]$, for all fixed $L > 0$,
\[
W_{\alpha,L,N} \xrightarrow{\ \mathrm{dist}\ } \mathrm{Po}(L),
\]
as $N \to \infty$.

If true, this conjecture would represent a strong local notion of pseudorandomness for the sequence $(\alpha n^2 \bmod 1)_{n=1}^{\infty}$, at least for a generic $\alpha$. In fact, a further conjecture [24, p. 38] posits more information about the full-measure set of suitable dilates $\alpha$. To state this conjecture, we recall that $\alpha$ is of type $\omega$ if there are only finitely many pairs $(a, q)$ with $|\alpha - a/q| < q^{-\omega}$.

Conjecture 1.2 (Rudnick, Sarnak, and Zaharescu) If $\alpha$ is of type $2 + \varepsilon$ for all $\varepsilon > 0$, and the convergents $a/q$ to $\alpha$ satisfy
\[
\lim_{q \to \infty} \log \tilde{q} / \log q = 1,
\]
where $\tilde{q}$ is the square-free part of $q$, then, for all fixed $L > 0$, as $N \to \infty$, one has
\[
W_{\alpha,L,N} \xrightarrow{\ \mathrm{dist}\ } \mathrm{Po}(L).
\]

Remark: We remind the reader that, by Dirichlet's approximation theorem, the type of a real number is never less than 2. Furthermore, by Khintchine's theorem, a generic number is of type $2 + \varepsilon$ for all $\varepsilon > 0$. One can also readily find explicit examples, like $\alpha = \sqrt{2}$, by using continued fractions.

Conjectures 1.1 and 1.2 appear to lie very deep. One hypothetical approach for showing the desired convergence in distribution would be to use the method of moments. More precisely, if one could show, for a generic $\alpha$, for a random variable $X_L \sim \mathrm{Po}(L)$, and for all $k \in \mathbb{N}$, that $\mathbb{E} W_{\alpha,L,N}^k \to \mathbb{E} X_L^k$ as $N \to \infty$, then Conjecture 1.1 would follow. Rudnick and Zaharescu [25] used this approach to show that if $(a_n)_{n=1}^{\infty}$ is a lacunary sequence of natural numbers, then, for almost all $\alpha$, the sequence $(\alpha a_n \bmod 1)_{n=1}^{\infty}$ has spacing statistics that agree with the Poisson model.
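The moments $\mathbb{E} W_{\alpha,L,N}^k$ that enter the method of moments can be evaluated numerically for small parameters, which may help build intuition for the comparison with $\mathrm{Po}(L)$. The Python sketch below is offered only as an illustration under stated assumptions: all function names are hypothetical, floating-point arithmetic is used throughout, and the window length $L/N$ is assumed to be strictly less than 1. It uses the observation that $W_{\alpha,L,N}$ is a piecewise constant function of $Y$, so that $\mathbb{E} W^k = \int_0^1 W(y)^k \, dy$ is a finite sum over the intervals between breakpoints.

```python
import bisect
import math


def window_count(xs, y, ell):
    """Number of sorted points xs in [0,1) lying in [y, y+ell] mod 1 (0 < ell < 1)."""
    n = len(xs)
    hi = y + ell
    if hi < 1.0:
        return bisect.bisect_right(xs, hi) - bisect.bisect_left(xs, y)
    # The window wraps around 1: count points >= y and points <= hi - 1.
    return (n - bisect.bisect_left(xs, y)) + bisect.bisect_right(xs, hi - 1.0)


def window_moment(points, ell, k):
    """k-th moment of the window count when the left endpoint Y is uniform on [0,1)."""
    xs = sorted(points)
    # The count can only change when Y crosses a point x or a shifted point x - ell.
    breaks = sorted({0.0, 1.0, *xs, *((x - ell) % 1.0 for x in xs)})
    total = 0.0
    for a, b in zip(breaks, breaks[1:]):
        mid = 0.5 * (a + b)  # the count is constant on (a, b)
        total += (b - a) * window_count(xs, mid, ell) ** k
    return total


def moment_W(alpha, L, N, k):
    """Floating-point evaluation of E[W_{alpha,L,N}^k] for the dilated squares."""
    pts = [(alpha * n * n) % 1.0 for n in range(1, N + 1)]
    return window_moment(pts, L / N, k)


def poisson_moments(L):
    """First three moments of a Poisson random variable with parameter L."""
    return (L, L + L**2, L + 3 * L**2 + L**3)


if __name__ == "__main__":
    alpha, L, N = math.sqrt(2), 2.0, 2000  # arbitrary illustrative choices
    print([moment_W(alpha, L, N, k) for k in (1, 2, 3)])
    print(poisson_moments(L))
```

A single numerical experiment of this kind says nothing about the almost-sure statements discussed here; it merely makes the objects concrete.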
For the squares, the first nontrivial case is $k = 2$, and this was settled by Rudnick and Sarnak (see footnote 3) some 22 years ago.

Footnote 2. These authors refer to the "distribution of the spacings between the elements," rather than mentioning the random variables $W_{\alpha,L,N}$ directly, but these are equivalent notions. For more on this other perspective, see the introduction to [24].

Footnote 3. We should remark that Rudnick and Sarnak phrased their result in terms of the pair correlation function, rather than explicitly mentioning $W_{\alpha,L,N}$, but the results are equivalent.

Theorem 1.3 (Rudnick and Sarnak [23]) For almost all $\alpha \in [0,1]$, for all fixed $L > 0$,
\[
\mathbb{E} W_{\alpha,L,N}^2 = L + L^2 + o_{\alpha,L}(1),
\]
as $N \to \infty$.

This gives an estimate on the so-called number variance $\mathrm{Var}(W_{\alpha,L,N})$, namely $\mathrm{Var}(W_{\alpha,L,N}) = L + o_{\alpha,L}(1)$, as $N \to \infty$.

Very little is known regarding the higher moments, although certain results can be extracted from the literature. For larger $L$, the issue is settled by the aforementioned discrepancy bounds. Indeed, expression (1.2) implies that, for almost all $\alpha \in [0,1]$, for all integers $k \ge 1$, for all $N \in \mathbb{N}$, and for all $L \in \mathbb{R}$ in the range $N^{1/2+\varepsilon} \le L \le N$,
\[
\mathbb{E} W_{\alpha,L,N}^k = L^k (1 + o_{\alpha,\varepsilon,k}(1)), \tag{1.5}
\]
as $N \to \infty$ (where the error term is independent of the choice of parameters $L$). We will give the simple proof of (1.5) in Appendix B, alongside other consequences of the discrepancy bounds. In passing, we will answer a question of Steinerberger from [27].

One might wonder whether the methods that Rudnick and Sarnak introduced to tackle the second moment in Theorem 1.3 could be applied to higher moments. Unfortunately, in the case when $L$ is constant, Rudnick and Sarnak already noted in [23, Section 4] that their method faces a major obstruction when applied to higher moments.
We will describe this obstruction in Appendix C, where (for the third moment) we observe that the obstruction persists for $L$ as large as $N^{1/3}$.

We now present the main result of this paper.

Theorem 1.4 (Main Theorem) Let $\varepsilon > 0$. Then, for almost all $\alpha \in [0,1]$, for all $N \in \mathbb{N}$, and for all $L \in \mathbb{R}$ in the range $N^{1/4+\varepsilon} < L \le N$, we have
\[
\mathbb{E} W_{\alpha,L,N}^3 = L^3 (1 + o_{\alpha,\varepsilon}(1)),
\]
as $N \to \infty$, where the $o_{\alpha,\varepsilon}(1)$ term is independent of the choice of the parameters $L$.

Note, in particular, that $\frac{1}{4} < \frac{1}{3}$, so we successfully give an asymptotic expression for the third moment in part of the range in which the Rudnick–Sarnak obstruction holds (see Appendix C).

To prove Theorem 1.4, we will first perform a standard reduction to a statement concerning correlation functions. These functions are closely related to the moments $\mathbb{E} W_{\alpha,L,N}^k$, but they can be more convenient to analyze.

Definition 1.1 (Correlation functions) Let $\alpha \in [0,1]$, let $k \ge 2$ be a natural number, and let $g : \mathbb{R}^{k-1} \to [0,1]$ be a compactly supported function. Then, the $k$th correlation function $R_k(\alpha, L, N, g)$ is defined to be
\[
\frac{1}{N} \sum_{\substack{1 \le x_1, \dots, x_k \le N \\ \text{distinct}}} g\Big( \frac{N}{L} \{\alpha(x_1^2 - x_2^2)\}_{\mathrm{sgn}}, \frac{N}{L} \{\alpha(x_2^2 - x_3^2)\}_{\mathrm{sgn}}, \dots, \frac{N}{L} \{\alpha(x_{k-1}^2 - x_k^2)\}_{\mathrm{sgn}} \Big),
\]
where $\{\cdot\}_{\mathrm{sgn}} : \mathbb{R} \to (-1/2, 1/2]$ denotes the signed distance to the nearest integer, and $N$ and $L$ are real parameters.

In practice, one only needs to understand the correlation functions when the test function $g$ is sufficiently nice, for example, smooth or the indicator function of a box.

When proving Theorem 1.3, Rudnick and Sarnak manipulated the pair correlation function $R_2(\alpha, L, N, g)$. We manipulate the triple correlation function $R_3(\alpha, L, N, g)$, proving the following result, which, for readers more familiar with correlation functions than with the moments of $W_{\alpha,L,N}$, might seem to be more natural.
Theorem 1.5 (Triple correlations) Let $\varepsilon > 0$. Then, for almost all $\alpha \in [0,1]$, for all compactly supported continuous functions $g : \mathbb{R}^2 \to [0,1]$, for all $N \in \mathbb{N}$, and for all $L \in \mathbb{R}$ in the range $N^{1/4+\varepsilon} < L < N^{1-\varepsilon}$, we have
\[
R_3(\alpha, L, N, g) = (1 + o_{\alpha,\varepsilon,g}(1)) L^2 \Big( \int g(w) \, dw \Big),
\]
as $N \to \infty$, where the $o_{\alpha,\varepsilon,g}(1)$ term is independent of the choice of the parameters $L$.

Before continuing to survey other relevant papers, we should stop to explain why the spacing statistics of the sequence $\alpha n^2 \bmod 1$ are of particular interest (aside from as a part of the larger endeavor of finding pseudorandomness in arithmetic sequences). This is due to a connection between number theory and theoretical physics known as arithmetic quantum chaos. In brief, the spacing statistics between elements of the sequence $\alpha n^2 \bmod 1$ correspond to the spacing statistics between the eigenvalues of a certain quantum system. (This system is a two-dimensional boxed oscillator, with a harmonic potential in one direction and hard walls in the other, as described in the introduction to [23].) A famous and far-reaching observation of Berry and Tabor [4] then suggests that such spacing statistics in the semiclassical limit (i.e., the distribution of $W_{\alpha,L,N}$ for constant $L$ as $N \to \infty$) should be determined by the dynamics of the corresponding classical system. Regular (integrable) classical dynamics should correspond to spacing statistics in asymptotic agreement with the Poisson model. Unfortunately, if $\alpha$ is rational (or is very well approximated by rationals with square denominators), it is easy to prove that high moments of $W_{\alpha,L,N}$ do not agree with the Poisson model (see [24, Theorem 2])! However, excluding such zero-measure counterexamples (see footnote 4), one might still hope for a metric result.

There are precious few examples of fixed sequences for which full information is known about the spacing statistics.
For the sequence $(\sqrt{n} \bmod 1)_{n=1}^{\infty}$, Elkies and McMullen [8] have established, using dynamical methods, the gap distribution of $(\sqrt{n} \bmod 1)_{n=1}^{\infty}$ (which is not from a Poisson model); El-Baz, Marklof, and Vinogradov [7] have demonstrated that the second moment of this gap distribution is nonetheless in accordance with the Poisson model. Furthermore, there are the results in [10] and [18] on the second moment of the gap distribution between values of quadratic forms.

Footnote 4. Zaharescu [30] showed, in a precise sense, that, at least among all very well approximable $\alpha$, these were the only counterexamples.

However, most of the results in the literature are metric in at least one parameter (e.g., [2, 23, 25, 26]), and such results still have substantial content. To delve further into the relationship to theoretical physics and to other spacing statistics would be to digress too far from our main theme; we direct the interested reader to the articles of Marklof [17] and Rudnick [20] for more on these issues.

Returning to the study of the moments of $W_{\alpha,L,N}$, and the discussion of relevant work, we continue with the paper [24]. Here, Rudnick, Sarnak, and Zaharescu develop tools to relate the Diophantine approximation properties of $\alpha$ to the size of the moments $\mathbb{E} W_{\alpha,L,N}^k$. The main result of that paper can be phrased as follows.

Theorem 1.6 (Rudnick, Sarnak, and Zaharescu) Let $\alpha \in [0,1]$, and suppose that there are infinitely many rationals $b_j/q_j$, with $q_j$ prime, satisfying
\[
\Big| \alpha - \frac{b_j}{q_j} \Big| < \frac{1}{q_j^3}. \tag{1.6}
\]
Then, there is a subsequence $N_j \to \infty$, with $\log N_j / \log q_j \to 1$, for which, for all $L > 0$,
\[
W_{\alpha,L,N_j} \xrightarrow{\ \mathrm{dist}\ } \mathrm{Po}(L),
\]
as $j \to \infty$.

The authors of [24] sacrificed the genericness of $\alpha$ (working instead with those $\alpha$ which are unusually well-approximable) in favor of control over the moments $\mathbb{E} W_{\alpha,L,N_j}^k$ for constant $L$ and for all $k$.
One may switch objectives in their analysis, sacrificing the range of $L$ in order to work with almost all $\alpha$. Applying this switch in the context of the third moment calculation, their method shows the following result.

Theorem 1.7 (Method of R–S–Z) Let $\varepsilon > 0$. Then, for almost all $\alpha \in [0,1]$, for all $N \in \mathbb{N}$, and for all $L \in \mathbb{R}$ in the range $N^{3/5+\varepsilon} < L \le N$, we have
\[
\mathbb{E} W_{\alpha,L,N}^3 = L^3 (1 + o_{\alpha,\varepsilon}(1)),
\]
as $N \to \infty$, where the $o_{\alpha,\varepsilon}(1)$ term is independent of the choice of parameters $L$. Moreover, one can give an explicit description of a full-measure set of suitable $\alpha$ (in terms of properties of rational approximations to $\alpha$).

Although the range of $L$ in Theorem 1.7 is much smaller than the range in our result, the method of R–S–Z yields a more explicit description of the full-measure set of suitable $\alpha$. We will indicate how to extract Theorem 1.7 from [24] in Section 3 below.

Our approach to proving Theorem 1.5 is inspired by this work of R–S–Z, but also by the work of Heath-Brown in [12], who introduced a related technique for studying the pair correlation function $R_2(\alpha, L, N, g)$. The full description of the method will come in Section 3, but we sketch the idea here, so as to explain in a rough way how we extract an improvement over [24]. After having replaced $\alpha$ by a rational approximation $a/q$, one transforms the triple correlation function into an expression that counts the number of solutions to certain polynomial equations modulo $q$. The equations which occur are of the form
\[
\{x, y, z \le N : x^2 - y^2 \equiv c_1 \ (\mathrm{mod}\ q), \ y^2 - z^2 \equiv c_2 \ (\mathrm{mod}\ q)\}, \tag{1.7}
\]
for certain ranges of $N$ and $q$ and for certain sets of coefficients $c_1$ and $c_2$. In [24], the number of solutions was estimated by using the "completion of sums" technique to remove the $N$ cutoff, followed by an implementation of the Hasse–Weil bound.
In our work, we manage to take advantage of the extra averaging over $c_1$ and $c_2$, which is present in the problem, together with some intricate (although elementary) exponential sum arguments, which together lead to a stronger bound for certain ranges of $N$ and $q$. Heath-Brown did something similar for the pair correlation function [12], but the analysis of the relevant exponential sums for the triple correlations is substantially more delicate.

Another relevant work is the paper of Rudnick and Kurlberg [22], in which those authors established that the spacing of quadratic residues mod $q$, as the number of prime factors of $q$ grows, is in agreement with the Poisson model. Lemma 3.3 below could be viewed as a special case of the arguments of that paper. However, as will become evident, the work here necessarily concerns a rather sparse subset of the set of quadratic residues mod $q$, as $N \approx q^{\theta}$ with $\theta < 1$, and so the work of [22] is not directly applicable. Pair correlations of rational functions mod $q$ were also studied by Boca and Zaharescu [5], but again, we will not be able to use that paper directly.

Very recently, and after the first version of the present manuscript was submitted, Lutsko released a preprint [16] which generalized Theorem 1.5 to all long-range correlation functions $R_k(\alpha, L, N, g)$, provided $L \ge N^{\frac{k-2}{2k-2}+\varepsilon}$. When $k = 3$, the threshold $\frac{k-2}{2k-2}$ recovers our threshold of $N^{1/4}$ for triple correlations. Note also that $\frac{k-2}{2k-2} \to \frac{1}{2}$ from below as $k \to \infty$, that is, Lutsko's threshold approaches the trivial discrepancy bound for large $k$. The methods of [16] seem to be completely different to our own, and are instead based on an analysis of the sums $\sum_{n \le N} e(\alpha m n^2)$ using the van der Corput B process (and subsequent stationary phase estimates). It remains to be seen whether any improvement to the threshold $c_3 = \frac{1}{4}$ could be derived by combining these two different approaches.

The structure of the paper is as follows.
In Section 2, we will give the standard argument (passing from moments to correlation functions), which reduces Theorem 1.4 to Theorem 1.5. The proof of Theorem 1.5 is then given in Section 3, in which it is resolved subject to three auxiliary results (one concerning Diophantine approximation, and the other two concerning the number of solutions to certain Diophantine equations similar to (1.7)). The final three sections of the paper settle these results (one per section), thus concluding the main proof. Appendices A–C contain some arguments that are minor modifications of the literature (but which are nonetheless pertinent to the main paper). These are, respectively, a version of Theorem 1.3 in which $L$ grows with $N$, the proof of the asymptotic (1.5), and the discussion of the obstruction to the study of triple correlations that was identified by Rudnick and Sarnak.

1.1 Notation

Most of our notation is standard, but perhaps we should highlight a few conventions. For a natural number $q$, we let $e_q(x)$ be a shorthand for $e^{2\pi i x/q}$, and given a parameter $M \ge 1$, we let $[M]$ denote the set $\{m \in \mathbb{N} : 1 \le m \le M\}$. In particular, $1_{[M]}$ denotes the indicator function of all the natural numbers at most $M$. If a range of summation is given as $\sum_{m \le M}$, then it is assumed that $m$ is a natural number and that $m \ge 1$. Finally, for $x \in \mathbb{R}$, we will use $\|x\|$ to denote the distance from $x$ to the nearest integer, and $\{x\}_{\mathrm{sgn}}$ to denote the signed distance to the nearest integer from $x$.

2 Reduction to correlation functions

In this short section, we will reduce Theorem 1.4 to Theorem 1.5. First, because estimate (1.5) holds for large $L$, we may assume, without loss of generality, that $L < N^{1-\varepsilon}$.
Then, we use linearity of expectation to deduce that
\[
\begin{aligned}
\mathbb{E} W_{\alpha,L,N}^3 &= \sum_{x, y, z \le N} \mathbb{P}(\alpha x^2, \alpha y^2, \alpha z^2 \in [Y, Y + L/N] \bmod 1) \\
&= \sum_{x \le N} \mathbb{P}(\alpha x^2 \in [Y, Y + L/N] \bmod 1) + 3 \sum_{\substack{x, y \le N \\ x \ne y}} \mathbb{P}(\alpha x^2, \alpha y^2 \in [Y, Y + L/N] \bmod 1) \\
&\quad + \sum_{\substack{x, y, z \le N \\ \text{distinct}}} \mathbb{P}(\alpha x^2, \alpha y^2, \alpha z^2 \in [Y, Y + L/N] \bmod 1),
\end{aligned} \tag{2.1}
\]
where $Y$ is a random variable that is uniformly distributed modulo 1. The first of the three terms in (2.1) is equal to $L$, and so may be absorbed into the error term of Theorem 1.4. The second term is
\[
\begin{aligned}
3 \sum_{\substack{x, y \le N \\ x \ne y \\ \|\alpha(x^2 - y^2)\| \le L/N}} \Big( \frac{L}{N} - \|\alpha(x^2 - y^2)\| \Big)
&= 3L \Bigg( \frac{1}{N} \sum_{\substack{x, y \le N \\ x \ne y}} \max\Big( 0, 1 - \frac{N}{L} \|\alpha(x^2 - y^2)\| \Big) \Bigg) \\
&= 3L \, R_2(\alpha, L, N, f),
\end{aligned} \tag{2.2}
\]
where $f : \mathbb{R} \to [0,1]$ is the function $f(x) = \max(0, 1 - |x|)$ and $R_2(\alpha, L, N, f)$ is the pair correlation function as defined in Definition 1.1. In Appendix A, we will show, by a trivial adaptation of the known techniques, that, for almost all $\alpha \in [0,1]$, one has
\[
R_2(\alpha, L, N, f) = (1 + o_{\alpha,f}(1)) L \Big( \int f(x) \, dx \Big) = (1 + o_{\alpha,f}(1)) L. \tag{2.3}
\]
Therefore, expression (2.2) is equal to $3L^2 + o_{\alpha,f}(L^2)$, which may also be absorbed into the error term of Theorem 1.4.

What remains is the third term of (2.1). This is equal to
\[
\frac{L}{N} \sum_{\substack{x, y, z \le N \\ \text{distinct}}} \max\Big( 0, 1 - \frac{N}{L} \max\big( \|\alpha(x^2 - y^2)\|, \|\alpha(y^2 - z^2)\|, \|\alpha(z^2 - x^2)\| \big) \Big). \tag{2.4}
\]
Because $(x^2 - y^2) + (y^2 - z^2) = x^2 - z^2$, we see that (2.4) is equal to $L \cdot R_3(\alpha, L, N, g)$ for some continuous compactly supported function $g : \mathbb{R}^2 \to [0,1]$. Indeed,
\[
g(w_1, w_2) =
\begin{cases}
\max(0, 1 - w_1 - w_2) & w_1, w_2 \ge 0, \\
\max(0, 1 - \max(w_1, -w_2)) & w_1 \ge 0, \ w_2 \le 0, \\
\max(0, 1 - \max(-w_1, w_2)) & w_1 \le 0, \ w_2 \ge 0, \\
\max(0, 1 + w_1 + w_2) & w_1, w_2 \le 0.
\end{cases}
\]
An elementary calculation then demonstrates that
\[
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(w_1, w_2) \, dw_1 \, dw_2 = 1.
\]
(The two quadrants in which $w_1$ and $w_2$ have the same sign contribute $\frac{1}{6}$ each, and the two mixed quadrants contribute $\frac{1}{3}$ each.) Therefore, by Theorem 1.5, if $L > N^{1/4+\varepsilon}$, then, for almost all $\alpha \in [0,1]$, expression (2.4) is equal to $L^3 (1 + o_{\alpha,\varepsilon}(1))$ as $N \to \infty$. So Theorem 1.4 is proved. ∎

We make the usual remark that, by approximating the continuous function $g$ above and below by step functions, to prove Theorem 1.5, it will be enough to prove the following result.

Theorem 2.1 Let $\varepsilon > 0$. Then, for almost all $\alpha \in [0,1]$, for all $s_1, t_1, s_2, t_2 \in \mathbb{R}$ for which $s_1 < t_1$ and $s_2 < t_2$, and for all $L$ in the range $N^{1/4+\varepsilon} < L < N^{1-\varepsilon}$, we have
\[
R_3(\alpha, L, N, g_{s,t}) = (1 + o_{\alpha,\varepsilon,s,t}(1)) L^2 (t_1 - s_1)(t_2 - s_2),
\]
as $N \to \infty$, where $s = (s_1, s_2)$, $t = (t_1, t_2)$, and $g_{s,t}$ is the indicator function of the box $[s_1, t_1] \times [s_2, t_2]$.

It will turn out to be crucial in our subsequent methods that the ratio $\log L / \log N$ does not vary too wildly. To finish this section, we will show how to deduce Theorem 2.1 from the following weaker result.

Theorem 2.2 Let $\beta \in (0, \frac{3}{4})$ and let $\eta > 0$. Then, if $\eta \max(\beta^{-1}, (\frac{3}{4} - \beta)^{-1})$ is small enough, the following holds: for almost all $\alpha \in [0,1]$, for all $s_1, t_1, s_2, t_2 \in \mathbb{R}$ for which $s_1 < t_1$ and $s_2 < t_2$, for all $N \in \mathbb{N}$, and for all $L \in \mathbb{R}$ in the range $N^{1-\beta-\eta} < L < N^{1-\beta+\eta}$, we have
\[
R_3(\alpha, L, N, g_{s,t}) = (1 + o_{\alpha,\beta,\eta,s,t}(1)) L^2 (t_1 - s_1)(t_2 - s_2),
\]
as $N \to \infty$, where $s = (s_1, s_2)$, $t = (t_1, t_2)$, and $g_{s,t}$ is the indicator function of the box $[s_1, t_1] \times [s_2, t_2]$. The error term is independent of the exact choice of the parameters $L$.

Proof (of Theorem 2.1, assuming Theorem 2.2) Let $\varepsilon > 0$ and choose $\eta > 0$ such that $\eta \varepsilon^{-1}$ is suitably small. Let $\{\beta_1, \dots, \beta_R\}$ be a maximal $\eta$-separated subset of $[\varepsilon, \frac{3}{4} - \varepsilon]$. Then, for each $\beta_i$, we get a full-measure set $\Omega_i \subset [0,1]$ such that if $\alpha \in \Omega_i$, the conclusion of Theorem 2.2 holds with $\beta = \beta_i$ and with $\eta$ as chosen. We claim that $\Omega = \cap_{i \le R} \Omega_i$ is a suitable full-measure set of values of $\alpha$ in Theorem 2.1.
Indeed, let $\alpha \in \Omega$ and let $s_1, t_1, s_2, t_2 \in \mathbb{R}$ with $s_1 < t_1$ and $s_2 < t_2$. From Theorem 2.2, we know that, for all $i \le R$, for any $\delta > 0$, for all $N \ge N_0(\alpha, \beta_i, \eta, \delta, s, t)$, and for any $L$ in the range $N^{1-\beta_i-\eta} < L < N^{1-\beta_i+\eta}$, we have
\[
|R_3(\alpha, L, N, g_{s,t}) - L^2 (t_1 - s_1)(t_2 - s_2)| < \delta L^2. \tag{2.5}
\]
Now, given any $N$ and any $L$ in the range $N^{1/4+\varepsilon} < L < N^{1-\varepsilon}$, there exists an $i$ such that $N^{1-\beta_i-\eta} < L < N^{1-\beta_i+\eta}$. Therefore, if $N \ge \max_{i \le R} N_0(\alpha, \beta_i, \eta, \delta, s, t)$, the inequality (2.5) holds. Because $\delta$ is arbitrary, the conclusion of Theorem 2.1 holds. ∎

3 Proof of Theorem 2.2

Our task is now to prove Theorem 2.2. Let us fix $\beta$ and $\eta$, where $\eta$ is assumed to be sufficiently small, and for the time being, let us also fix some $\alpha \in [0,1]$ and some $s_1, t_1, s_2, t_2 \in \mathbb{R}$ with $s_1 < t_1$ and $s_2 < t_2$. We may also assume, without loss of generality, that $N$ is sufficiently large in terms of $\alpha$, $\beta$, $\eta$, $s$, and $t$.

We begin by replacing $\alpha$ with a suitably good rational approximation $a/q$. Assume that there exists some rational $a/q$, with $q$ prime, for which
\[
\Big| \alpha - \frac{a}{q} \Big| \le \frac{1}{q^{2-\eta}} \tag{3.1}
\]
and
\[
N^{\frac{2+\beta}{2} + 10\eta} \le q \le 2 N^{\frac{2+\beta}{2} + 10\eta}. \tag{3.2}
\]
We will show in Section 4 that almost all $\alpha$ admit such an approximation. Then, for such a pair $(N, q)$, we define
\[
A(N, q, c_1, c_2) := |\{x, y, z \le N : x^2 - y^2 \equiv c_1 \ (\mathrm{mod}\ q), \ y^2 - z^2 \equiv c_2 \ (\mathrm{mod}\ q)\}|. \tag{3.3}
\]
We then claim that
\[
\frac{1}{N} \sum_{\substack{(r_1, r_2) \in S^- \\ r_1 r_2 \ne 0 \\ r_1 + r_2 \ne 0}} A(N, q, \bar{a} r_1, \bar{a} r_2) \le R_3(\alpha, L, N, g_{s,t}) \le \frac{1}{N} \sum_{\substack{(r_1, r_2) \in S^+ \\ r_1 r_2 \ne 0 \\ r_1 + r_2 \ne 0}} A(N, q, \bar{a} r_1, \bar{a} r_2), \tag{3.4}
\]
where $\bar{a}$ denotes the inverse of $a$ modulo $q$,
\[
S^- = \Big\{ (r_1, r_2) \in \mathbb{Z}^2 : \frac{s_i q L}{N} + \frac{N^2}{q^{1-\eta}} \le r_i \le \frac{t_i q L}{N} - \frac{N^2}{q^{1-\eta}}, \ i = 1, 2 \Big\}, \tag{3.5}
\]
and
\[
S^+ = \Big\{ (r_1, r_2) \in \mathbb{Z}^2 : \frac{s_i q L}{N} - \frac{N^2}{q^{1-\eta}} \le r_i \le \frac{t_i q L}{N} + \frac{N^2}{q^{1-\eta}}, \ i = 1, 2 \Big\}. \tag{3.6}
\]
Indeed, given $x, y$ in the range $1 \le x, y \le N$, let us consider $r_1 \in \mathbb{Z}$ to be defined by the relation
\[
a(x^2 - y^2) \equiv r_1 \ (\mathrm{mod}\ q)
\]
and $-q/2 < r_1 < q/2$. Suppose that
\[
\frac{s_1 q L}{N} + \frac{N^2}{q^{1-\eta}} \le r_1 \le \frac{t_1 q L}{N} - \frac{N^2}{q^{1-\eta}}.
\]
Then, we have the inequalities
\[
\{\alpha(x^2 - y^2)\}_{\mathrm{sgn}} \le \Big\{ \frac{a}{q}(x^2 - y^2) \Big\}_{\mathrm{sgn}} + \Big| \Big( \alpha - \frac{a}{q} \Big)(x^2 - y^2) \Big| \le \frac{r_1}{q} + \frac{N^2}{q^{2-\eta}} \le t_1 \frac{L}{N}
\]
and
\[
\{\alpha(x^2 - y^2)\}_{\mathrm{sgn}} \ge \Big\{ \frac{a}{q}(x^2 - y^2) \Big\}_{\mathrm{sgn}} - \Big| \Big( \alpha - \frac{a}{q} \Big)(x^2 - y^2) \Big| \ge \frac{r_1}{q} - \frac{N^2}{q^{2-\eta}} \ge s_1 \frac{L}{N}.
\]
Suppose instead that
\[
s_1 \frac{L}{N} \le \{\alpha(x^2 - y^2)\}_{\mathrm{sgn}} \le t_1 \frac{L}{N}.
\]
Then, similarly to the above, we have
\[
r_1 = q \Big\{ \frac{a}{q}(x^2 - y^2) \Big\}_{\mathrm{sgn}} \le q \Big( \{\alpha(x^2 - y^2)\}_{\mathrm{sgn}} + \Big| \Big( \alpha - \frac{a}{q} \Big)(x^2 - y^2) \Big| \Big) \le t_1 \frac{L q}{N} + \frac{N^2}{q^{1-\eta}}
\]
and
\[
r_1 = q \Big\{ \frac{a}{q}(x^2 - y^2) \Big\}_{\mathrm{sgn}} \ge q \Big( \{\alpha(x^2 - y^2)\}_{\mathrm{sgn}} - \Big| \Big( \alpha - \frac{a}{q} \Big)(x^2 - y^2) \Big| \Big) \ge s_1 \frac{L q}{N} - \frac{N^2}{q^{1-\eta}}.
\]
Finally, take $1 \le z \le N$ and define $r_2$ by the relation $a(y^2 - z^2) \equiv r_2 \ (\mathrm{mod}\ q)$ with $-q/2 < r_2 < q/2$. Then, because $N < q/2$, we have that $x, y, z$ are distinct if and only if $r_1 r_2 \ne 0$ and $r_1 + r_2 \ne 0$. From all these observations taken together, claim (3.4) is settled.

Remark: The reader might find it helpful to note, at this early stage, that the relative sizes of $N$ and $q$ were chosen so that the two terms $qL/N$ and $N^2/q^{1-\eta}$ which appear in (3.5) and (3.6) are of approximately the same magnitude, namely $q^{(2-\beta)/(2+\beta)}$.

A substantial portion of this paper will involve estimating the quantity $A(N, q, c_1, c_2)$, on average over $c_1$ and $c_2$. To this end, we let
\[
A_0(q, c_1, c_2) = |\{x, y, z \le q : x^2 - y^2 \equiv c_1 \ (\mathrm{mod}\ q), \ y^2 - z^2 \equiv c_2 \ (\mathrm{mod}\ q)\}|
\]
be the number of solutions to the key congruences, in which the variables $x, y, z$ may range over the entire field $\mathbb{Z}/q\mathbb{Z}$.
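The counting functions $A$ and $A_0$ just defined are easy to tabulate for small parameters, which can be useful for sanity checks. The following Python sketch is a brute-force illustration only, with hypothetical function names; it is not the exponential-sum method used in the paper. It assumes only that $q$ is a positive integer and that $c_1, c_2$ are integers (reduced modulo $q$ inside the function). For each $y$ it multiplies the number of admissible $x$ by the number of admissible $z$.

```python
from collections import Counter


def square_counts(M, q):
    """Map each residue r mod q to the number of x in {1, ..., M} with x^2 = r (mod q)."""
    return Counter((x * x) % q for x in range(1, M + 1))


def A(M, q, c1, c2):
    """Number of triples x, y, z in {1, ..., M} with
    x^2 - y^2 = c1 (mod q) and y^2 - z^2 = c2 (mod q), as in (3.3)."""
    cnt = square_counts(M, q)
    total = 0
    for y in range(1, M + 1):
        y2 = (y * y) % q
        # x must satisfy x^2 = y^2 + c1, and z must satisfy z^2 = y^2 - c2.
        total += cnt[(y2 + c1) % q] * cnt[(y2 - c2) % q]
    return total


def A0(q, c1, c2):
    """The complete count, in which x, y, z range over all residues mod q."""
    return A(q, q, c1, c2)


if __name__ == "__main__":
    q, M = 101, 40  # arbitrary illustrative parameters, with q an odd prime
    for c1, c2 in [(1, 2), (3, 5), (0, 7)]:
        print((c1, c2), A(M, q, c1, c2), A0(q, c1, c2))
```

Since every triple $(x, y, z)$ determines exactly one pair of residues $(c_1, c_2)$, summing `A(M, q, c1, c2)` over all residues $c_1, c_2$ returns $M^3$, which gives a quick consistency check.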
One might reasonably expect that $A(M, q, c_1, c_2) \approx (M/q)^3 A_0(q, c_1, c_2)$ as long as $M$ is large enough, and so we introduce the difference
\[
\Delta(M, q, c_1, c_2) := \Big| A(M, q, c_1, c_2) - \Big( \frac{M}{q} \Big)^3 A_0(q, c_1, c_2) \Big|. \tag{3.7}
\]
The following key technical lemma will be proved in Section 5.

Lemma 3.1 If $q$ is an odd prime and $M < q$, then
\[
\sum_{c_1, c_2 \le q} \Delta(M, q, c_1, c_2)^2 \ll (\log q)^3 M^3 + (\log q)^6 q^2, \tag{3.8}
\]
as $q \to \infty$.

Lemma 3.1 can be used, in turn, in Section 6 to prove the following lemma on the average size of the error terms $\Delta(N, q, \bar{a} r_1, \bar{a} r_2)$, which is helpful for analyzing the relation (3.4).

Lemma 3.2 Let $\beta \in (0, \frac{3}{4})$ and let $\eta > 0$. Then, if $\eta \max(\beta^{-1}, (\frac{3}{4} - \beta)^{-1})$ is small enough, the following holds: for almost all $\alpha \in [0,1]$, for all $s_1, t_1, s_2, t_2 \in \mathbb{R}$ for which $s_1 < t_1$ and $s_2 < t_2$, for all $N \in \mathbb{N}$, for all $L \in \mathbb{R}$ in the range $N^{1-\beta-\eta} < L < N^{1-\beta+\eta}$, and for all prime $q$ and $a/q$ satisfying (3.1) and (3.2), we have
\[
\frac{1}{N} \sum_{\substack{(r_1, r_2) \in S^+ \\ r_1 r_2 \ne 0 \\ r_1 + r_2 \ne 0}} \Delta(N, q, \bar{a} r_1, \bar{a} r_2) \ll_{\alpha,\beta,\eta,s,t} L^2 q^{-\eta}, \tag{3.9}
\]
where $S^+$ is as defined in (3.6).

At this point, it is worth taking a small diversion from the main proof to discuss the numerology in Lemma 3.1, and why the bound is close to best possible. We begin with an easy lemma concerning $A_0(q, c_1, c_2)$ itself. A more sophisticated version of this lemma was worked out by Rudnick and Kurlberg in [22, Proposition 4], but to make our exposition self-contained, we decided to include a direct proof for the triple correlation case.

Lemma 3.3 Let $q$ be an odd prime.
If $c_1, c_2$ are not both divisible by $q$, then
\[
A_0(q, c_1, c_2) =
\begin{cases}
q + O(\sqrt{q}) & \text{if } c_1 c_2 (c_1 + c_2) \not\equiv 0 \pmod q, \\[2pt]
2q - 1 - \left(\dfrac{-c_2}{q}\right) & \text{if } c_1 \equiv 0 \pmod q, \\[6pt]
2q - 1 - \left(\dfrac{c_1}{q}\right) & \text{if } c_2 \equiv 0 \pmod q, \\[6pt]
2q - 1 - \left(\dfrac{c_2}{q}\right) & \text{if } c_1 + c_2 \equiv 0 \pmod q,
\end{cases}
\]
where $\left(\frac{\cdot}{q}\right)$ denotes the Legendre symbol modulo $q$. Furthermore, $A_0(q, 0, 0) = 4q - 3$.

Remark: For this paper, it would have been enough to consider only the nondegenerate case of Lemma 3.3. However, the arguments of Section 5 are cleaner if we allow ourselves to include the degenerate cases, which is the reason why we do so.

Proof In the trivial case $c_1 \equiv c_2 \equiv 0 \pmod q$, we note that $y \equiv \pm x \pmod q$ and $z \equiv \pm y \pmod q$. So, except when $x \equiv 0 \pmod q$, there are four choices for $(y, z)$ given a fixed $x$, thus showing that $A_0(q, 0, 0) = 4(q - 1) + 1 = 4q - 3$ as claimed.

Thus, we assume in the following that at least one of $c_1, c_2$ is not divisible by $q$. We have that
\[
A_0(q, c_1, c_2) = \frac{1}{q^2} \sum_{0 \leqslant j, k \leqslant q-1} \; \sum_{0 \leqslant x, y, z \leqslant q-1} e_q\bigl( j(x^2 - y^2 - c_1) + k(y^2 - z^2 - c_2) \bigr).
\]
The terms when $j = 0$, $k = 0$, or $j = k$ contribute
\[
\frac{1}{q^2} \Biggl( q^2 \sum_{\substack{0 \leqslant x, y \leqslant q-1 \\ y^2 \equiv x^2 - c_1 \ (\mathrm{mod}\ q)}} 1 \;+\; q^2 \sum_{\substack{0 \leqslant y, z \leqslant q-1 \\ y^2 \equiv z^2 + c_2 \ (\mathrm{mod}\ q)}} 1 \;+\; q^2 \sum_{\substack{0 \leqslant x, z \leqslant q-1 \\ x^2 \equiv z^2 + c_1 + c_2 \ (\mathrm{mod}\ q)}} 1 \;-\; 2q^3 \Biggr),
\]
where the final term is used to correct for the overcounting of $(j, k) = (0, 0)$. By factorizing the ranges of summation using the difference of two squares, this expression is equal to
\[
q + (q - 1)\bigl( 1_{q \mid c_1} + 1_{q \mid c_2} + 1_{q \mid (c_1 + c_2)} \bigr).
\]
Recall the Gauss sum evaluation
\[
\sum_{0 \leqslant x \leqslant q-1} e_q(jx^2) = \left(\frac{j}{q}\right) \varepsilon_q \sqrt{q}, \tag{3.10}
\]
valid when $q \nmid j$, where $\varepsilon_q = 1$ if $q \equiv 1 \pmod 4$ and $\varepsilon_q = i$ if $q \equiv 3 \pmod 4$ (see [13, Theorem 3.3]). Therefore, the contribution from the remaining frequencies $j, k$ is exactly
\[
\frac{\varepsilon_q^3}{q^{1/2}} \sum_{\substack{1 \leqslant j, k \leqslant q-1 \\ j \neq k}} \left(\frac{j}{q}\right) \left(\frac{k - j}{q}\right) \left(\frac{-k}{q}\right) e_q(-jc_1)\, e_q(-kc_2).
\]
By introducing a change of variables $k = lj$, this expression is equal to
\[
\frac{\varepsilon_q^3}{q^{1/2}} \left(\frac{-1}{q}\right) \sum_{2 \leqslant l \leqslant q-1} \left(\frac{l}{q}\right) \left(\frac{l - 1}{q}\right) \sum_{1 \leqslant j \leqslant q-1} \left(\frac{j}{q}\right) e_q\bigl( j(-c_1 - lc_2) \bigr). \tag{3.11}
\]
One can evaluate the inner sum of (3.11) using (3.10). Indeed, for an arbitrary integer $m$ and an arbitrary quadratic nonresidue $h$, we have
\begin{align*}
\sum_{1 \leqslant j \leqslant q-1} \left(\frac{j}{q}\right) e_q(jm)
&= \frac{1}{2} \Biggl( \sum_{1 \leqslant n \leqslant q-1} e_q(mn^2) - \sum_{1 \leqslant n \leqslant q-1} e_q(mn^2 h) \Biggr) \\
&= \frac{1}{2} \Biggl( \sum_{0 \leqslant n \leqslant q-1} e_q(mn^2) - \sum_{0 \leqslant n \leqslant q-1} e_q(mn^2 h) \Biggr) \\
&= \frac{1}{2} \left( \left(\frac{m}{q}\right) - \left(\frac{hm}{q}\right) \right) \varepsilon_q \sqrt{q} \\
&= \left(\frac{m}{q}\right) \varepsilon_q \sqrt{q}.
\end{align*}
Plugging this expression into (3.11), we establish that (3.11) is equal to
\[
\sum_{2 \leqslant l \leqslant q-1} \left(\frac{l(l - 1)(lc_2 + c_1)}{q}\right).
\]
This is $O(\sqrt{q})$ by Hasse (see [13, equation (14.32)]), provided that none of $q \mid c_1$, $q \mid c_2$, $q \mid (c_1 + c_2)$ holds. In the singular cases, if $q$ divides $c_1$, we end up with
\[
\left(\frac{c_2}{q}\right) \sum_{2 \leqslant l \leqslant q-1} \left(\frac{l - 1}{q}\right),
\]
which is equal to $-\left(\frac{-c_2}{q}\right)$. If $q \mid (c_1 + c_2)$, we end up with
\[
\left(\frac{c_2}{q}\right) \sum_{2 \leqslant l \leqslant q-1} \left(\frac{l}{q}\right),
\]
which is equal to $-\left(\frac{c_2}{q}\right)$. The final case, when $q \mid c_2$, follows easily from the case $q \mid c_1$ after permuting the variables $x, y, z$ in the original expression for $A_0(q, c_1, c_2)$. ∎

Therefore, $(M/q)^3 A_0(q, c_1, c_2) \asymp M^3 q^{-2}$, and this is the expected size of $A(M, q, c_1, c_2)$. We note, then, that the $M^3$ term in (3.8) represents "square-root cancellation on average" for the size of $\Delta(M, q, c_1, c_2)$. Moreover, Lemma 3.1 is close to best possible, at least for certain ranges of $M$. We would like to emphasize that here and throughout, $M$ denotes a positive integer. Indeed, if $M < \lambda q^{2/3}$ and $\lambda$ is a suitably small constant, then we have the matching lower bound
\[
\sum_{c_1, c_2 \leqslant q} \Delta(M, q, c_1, c_2)^2 \gg M^3. \tag{3.12}
\]
Proof Certainly,
\[
\sum_{c_1, c_2 \leqslant q} A(M, q, c_1, c_2) = M^3.
\]
Furthermore,
\[
\sum_{c_1, c_2 \leqslant q} A(M, q, c_1, c_2)^2 = \sum_{\substack{x, y, z, x', y', z' \leqslant M \\ x^2 - (x')^2 \equiv y^2 - (y')^2 \equiv z^2 - (z')^2 \ (\mathrm{mod}\ q)}} 1.
\]
By using the divisor bound to control the terms arising when $x^2 - (x')^2 \neq 0$, this sum is at most $O(q^{o(1)} M^2 q^{-1} + M^3)$, which is certainly at most $O(M^3)$. Therefore, by Cauchy–Schwarz,
\[
\sum_{\substack{c_1, c_2 \leqslant q \\ A(M, q, c_1, c_2) \geqslant 1}} 1 \;\geqslant\; \Biggl( \sum_{c_1, c_2 \leqslant q} A(M, q, c_1, c_2) \Biggr)^{2} \Biggl( \sum_{c_1, c_2 \leqslant q} A(M, q, c_1, c_2)^2 \Biggr)^{-1} \gg M^3.
\]
If $M q^{-2/3}$ is sufficiently small, then $(M/q)^3 A_0(q, c_1, c_2) < \tfrac{1}{2}$ for all $c_1, c_2$. Hence,
\[
\sum_{c_1, c_2 \leqslant q} \Delta(M, q, c_1, c_2)^2 \gg \sum_{\substack{c_1, c_2 \leqslant q \\ A(M, q, c_1, c_2) \geqslant 1}} 1 \gg M^3,
\]
as claimed. ∎

Rudnick, Sarnak, and Zaharescu [24] also considered $\Delta(M, q, c_1, c_2)$. By Fourier expanding the cutoff $x, y, z \leqslant M$, they derived
\[
\Delta(M, q, c_1, c_2) \leqslant \sum_{\substack{0 \leqslant b_1, b_2, b_3 \leqslant q-1 \\ b \neq 0}} \; \prod_{i=1}^{3} \bigl| \widehat{1}_{[M]}(b_i) \bigr| \; \Biggl| \sum_{\substack{0 \leqslant x, y, z \leqslant q-1 \\ x^2 - y^2 \equiv c_1 \ (\mathrm{mod}\ q) \\ y^2 - z^2 \equiv c_2 \ (\mathrm{mod}\ q)}} e_q(b_1 x + b_2 y + b_3 z) \Biggr|,
\]
where
\[
\widehat{1}_{[M]}(b) = \frac{1}{q} \sum_{x \leqslant M} e_q(-bx). \tag{3.13}
\]
In expression (9.15) of [24] they used the Weil bound,$^{5}$ ending up with a bound of
\[
\Delta(M, q, c_1, c_2) \ll q^{1/2} (\log q)^3 \tag{3.14}
\]
in the nondegenerate cases. Comparing this result to Lemma 3.1, estimate (3.14) implies
\[
\sum_{c_1, c_2 \leqslant q} \Delta(M, q, c_1, c_2)^2 \ll q^{3 + o(1)}, \tag{3.15}
\]
which is weaker than Lemma 3.1. We note, therefore, that the proof of Lemma 3.1 must utilize the extra averaging in $c_1$ and $c_2$ in a critical way.

In the introduction, we promised to explain how the threshold $N^{3/5}$ in Theorem 1.7 arises from the arguments of [24], and now seems to be an appropriate moment.
Indeed, in order to analyze (3.4), it would be enough to show that $\Delta(N, q, ar_1, ar_2) = o\bigl( (N/q)^3 A_0(q, ar_1, ar_2) \bigr)$, because then one could replace $A(N, q, ar_1, ar_2)$ with $(N/q)^3 A_0(q, ar_1, ar_2)$ (and then evaluate these latter terms explicitly using Lemma 3.3). Using the bound (3.14), this is only possible when $N^3 q^{-2} > q^{1/2} (\log q)^3$, that is, provided that $N \geqslant q^{5/6 + o(1)}$. This, one notes, is the same as the threshold from Theorem 4 of [24] taken with $m = 3$ (for triple correlations). Noting that $N \approx q^{2/(2+\beta)}$, this approach succeeds provided that $2/(2 + \beta) > 5/6$, that is, provided that $\beta < 2/5$. From the definition of $\beta$, this implies that $L$ must be at least $N^{3/5}$.

Having finished our diversion on the subject of Lemma 3.1 (whose proof is deferred to Section 5), let us return to the main argument, namely the proof of Theorem 2.2.

(Footnote 5: For triple correlations, the relevant curve has genus 1, so this is, in fact, the same Hasse bound as we used in Lemma 3.3.)

From now on, we assume $\alpha$ satisfies Lemma 3.2. Putting this information into expression (3.4), we derive
\begin{align*}
N^2 q^{-3} \sum_{\substack{(r_1, r_2) \in S^{-} \\ r_1 r_2 \neq 0 \\ r_1 + r_2 \neq 0}} A_0(q, ar_1, ar_2) - O_{\alpha, \beta, \eta, s, t}(L^2 q^{-\eta})
&\leqslant R_3'(\alpha, L, N, g_{s,t}) \\
&\leqslant N^2 q^{-3} \sum_{\substack{(r_1, r_2) \in S^{+} \\ r_1 r_2 \neq 0 \\ r_1 + r_2 \neq 0}} A_0(q, ar_1, ar_2) + O_{\alpha, \beta, \eta, s, t}(L^2 q^{-\eta}). \tag{3.16}
\end{align*}
To estimate the terms in the expression (3.16), we use Lemma 3.3. Indeed,
\[
N^2 q^{-3} \sum_{\substack{(r_1, r_2) \in S^{\pm} \\ r_1 r_2 \neq 0 \\ r_1 + r_2 \neq 0}} A_0(q, ar_1, ar_2) = N^2 q^{-3} \sum_{\substack{(r_1, r_2) \in S^{\pm} \\ r_1 r_2 \neq 0 \\ r_1 + r_2 \neq 0}} \bigl( q + O(q^{1/2}) \bigr). \tag{3.17}
\]
The size of $S^{\pm}$ is given by
\[
\left( (t_1 - s_1) \frac{qL}{N} \pm 2 \frac{N^2}{q^{1-\eta}} + O(1) \right) \left( (t_2 - s_2) \frac{qL}{N} \pm 2 \frac{N^2}{q^{1-\eta}} + O(1) \right).
\]
From relation (3.2), one observes that
\[
N^2 q^{-1+\eta} = o_{\beta, \eta}\!\left( \frac{qL}{N} \right),
\]
as $N \to \infty$, and therefore, the size of $S^{\pm}$ is seen to be
\[
|S^{\pm}| = (1 + o_{\beta, \eta, s, t}(1)) (t_1 - s_1)(t_2 - s_2)\, q^2 L^2 N^{-2}, \tag{3.18}
\]
as $N \to \infty$.
The estimate (3.18) remains valid after we remove those pairs $(r_1, r_2) \in S^{\pm}$ with $r_1 r_2 = 0$ or $r_1 + r_2 = 0$. Thus, returning to (3.17), we conclude that
\[
N^2 q^{-3} \sum_{\substack{(r_1, r_2) \in S^{\pm} \\ r_1 r_2 \neq 0 \\ r_1 + r_2 \neq 0}} A_0(q, ar_1, ar_2) = (1 + o_{\beta, \eta, s, t}(1))\, L^2 (t_1 - s_1)(t_2 - s_2).
\]
Substituting this estimate into (3.16), we derive Theorem 2.2 as required. ∎

What remains is to verify that almost all $\alpha$ admit an approximation $a/q$ of the form required in (3.1) and (3.2), and to prove Lemmas 3.1 and 3.2.

4 Approximation with prime denominator

To derive a suitable approximation $a/q$ with prime denominator, we use the following quantitative version of (a generalized) Khintchine's theorem, due to Harman.

Theorem 4.1 [11, Theorem 4.2] Let $\psi : \mathbb{N} \to (0, 1)$ be a nonincreasing function such that
\[
\Psi(N) = \sum_{n \leqslant N} \psi(n)
\]
is unbounded. For $B$ an infinite set of integers, let $S(B, \alpha, N)$ denote the number of $n \leqslant N$, with $n \in B$, such that $\|n\alpha\| < \psi(n)$. Then, for almost all $\alpha$, we have
\[
S(B, \alpha, N) = 2\Psi(N, B) + O_{\varepsilon}\bigl( (\Psi(N))^{1/2} (\log \Psi(N))^{2+\varepsilon} \bigr), \tag{4.1}
\]
for each $\varepsilon > 0$, where
\[
\Psi(N, B) = \sum_{n \in B \cap [N]} \psi(n).
\]
Moreover, the implied constant in (4.1) is uniform in $\alpha$.

From this, we deduce the following lemma.

Lemma 4.2 Let $\eta \in (0, 1)$. Then, for almost all $\alpha$ and for all $N \geqslant N_0(\eta)$, there is a prime $q$ satisfying
\[
\|q\alpha\| < \frac{1}{N^{1-\eta}} \quad \text{and} \quad N \leqslant q \leqslant 2N. \tag{4.2}
\]
Proof We let $B$ denote the set of primes, and $\psi(n) = n^{-1+\eta}$. Then, $\Psi(N) \sim \eta^{-1} N^{\eta}$ and (by the prime number theorem) $\Psi(N, B) \sim \eta^{-1} N^{\eta} (\log N)^{-1}$. So, the asymptotic formula (4.1) shows that, for almost all $\alpha$, one has that
\[
S(B, \alpha, N) = \frac{2N^{\eta}}{\eta \log N} (1 + o_{\eta}(1)) + O\bigl( \eta^{-1} N^{\eta/2} (\log N)^3 \bigr),
\]
with the implied constant uniform in $\alpha$. From this, it immediately follows that if $N$ is sufficiently large in terms of $\eta$, then $S(B, \alpha, 2N) - S(B, \alpha, N) > 0$, as required.
∎

Therefore, an approximation $a/q$ may be found, with $q$ prime, that satisfies (3.1) and (3.2).

5 Proof of Lemma 3.1

Expanding the square, we have
\[
\sum_{c_1, c_2 \leqslant q} \Delta(M, q, c_1, c_2)^2 = S_1 - 2S_2 + S_3,
\]
where
\begin{align*}
S_1 &= \sum_{c_1, c_2 \leqslant q} A(M, q, c_1, c_2)^2, \\
S_2 &= \left(\frac{M}{q}\right)^{3} \sum_{c_1, c_2 \leqslant q} A(M, q, c_1, c_2)\, A_0(q, c_1, c_2), \\
S_3 &= \left(\frac{M}{q}\right)^{6} \sum_{c_1, c_2 \leqslant q} A_0(q, c_1, c_2)^2.
\end{align*}
We can write each $S_i$ as the number of solutions to certain congruences. Writing $(\ast)$ for the pair of conditions
\[
x^2 - y^2 \equiv (x')^2 - (y')^2 \pmod q, \qquad y^2 - z^2 \equiv (y')^2 - (z')^2 \pmod q,
\]
we have
\begin{align*}
S_1 &= \sum_{\substack{1 \leqslant x, y, z \leqslant M \\ 1 \leqslant x', y', z' \leqslant M \\ (\ast)}} 1, \\
S_2 &= \left(\frac{M}{q}\right)^{3} \sum_{\substack{1 \leqslant x, y, z \leqslant M \\ 1 \leqslant x', y', z' \leqslant q \\ (\ast)}} 1, \\
S_3 &= \left(\frac{M}{q}\right)^{6} \sum_{\substack{1 \leqslant x, y, z \leqslant q \\ 1 \leqslant x', y', z' \leqslant q \\ (\ast)}} 1.
\end{align*}
Expanding the cutoffs $1 \leqslant x, y, z, x', y', z' \leqslant M$ in terms of additive characters, we have
\[
S_1 = \sum_{\substack{b = (b_1, b_2, b_3, b_4, b_5, b_6) \\ 0 \leqslant b_1, \dots, b_6 \leqslant q-1}} S(b, q) \prod_{i=1}^{6} \widehat{1}_{[M]}(b_i),
\]
where
\[
S(b, q) := \sum_{\substack{x, y, z \leqslant q \\ x', y', z' \leqslant q \\ (\ast)}} e_q\bigl( b \cdot (x, y, z, x', y', z') \bigr)
\]
and $\widehat{1}_{[M]}$ is as in (3.13). The contribution from the term with $b = 0$ is equal to $S_3$. Performing the same expansion on $S_2$, we see that the terms arising from $b = 0$ cancel, and we are left with the bound
\[
S_1 - 2S_2 + S_3 \leqslant \sum_{\substack{0 \leqslant b_1, \dots, b_6 \leqslant q-1 \\ b \neq 0}} \Biggl( \prod_{i=1}^{6} \bigl| \widehat{1}_{[M]}(b_i) \bigr| \Biggr) |S(b, q)|. \tag{5.1}
\]
Our task moves to bounding $|S(b, q)|$. We have the trivial bound $|S(b, q)| \leqslant q^6$, but because $q$ is prime and $b \neq 0$, we will be able to improve on this bound substantially. Our approach will be elementary.
We begin with writing
\begin{align*}
S(b, q) = \frac{1}{q^2} \sum_{j, k \leqslant q} \; \sum_{\substack{x, y, z \leqslant q \\ x', y', z' \leqslant q}} e_q\bigl( & b \cdot (x, y, z, x', y', z') \\
& + j(x^2 - y^2 - (x')^2 + (y')^2) + k(y^2 - z^2 - (y')^2 + (z')^2) \bigr). \tag{5.2}
\end{align*}
As we did when estimating $A_0(q, c_1, c_2)$, let us first consider the contribution from those terms when $j = 0$, $k = 0$, or $j = k$. This is
\begin{align*}
& \frac{1}{q^2} \sum_{k \leqslant q} \sum_{\substack{x, y, z \leqslant q \\ x', y', z' \leqslant q}} e_q\bigl( b \cdot (x, y, z, x', y', z') + k(y^2 - z^2 - (y')^2 + (z')^2) \bigr) \\
&+ \frac{1}{q^2} \sum_{j \leqslant q} \sum_{\substack{x, y, z \leqslant q \\ x', y', z' \leqslant q}} e_q\bigl( b \cdot (x, y, z, x', y', z') + j(x^2 - y^2 - (x')^2 + (y')^2) \bigr) \\
&+ \frac{1}{q^2} \sum_{j \leqslant q} \sum_{\substack{x, y, z \leqslant q \\ x', y', z' \leqslant q}} e_q\bigl( b \cdot (x, y, z, x', y', z') + j(x^2 - z^2 - (x')^2 + (z')^2) \bigr) \\
&- \frac{2}{q^2} \sum_{\substack{x, y, z \leqslant q \\ x', y', z' \leqslant q}} e_q\bigl( b \cdot (x, y, z, x', y', z') \bigr). \tag{5.3}
\end{align*}
The first three of these terms devolve into the exponential sums involved in the pair correlations of the fractional parts of $\alpha n^2$ considered by Heath-Brown in [12]. Indeed, note that, by completing the square in the variables $y, y', z, z'$, we have that
\[
\frac{1}{q^2} \sum_{k \leqslant q} \sum_{\substack{x, y, z \leqslant q \\ x', y', z' \leqslant q}} e_q\bigl( b \cdot (x, y, z, x', y', z') + k(y^2 - z^2 - (y')^2 + (z')^2) \bigr) \tag{5.4}
\]
is equal to
\[
1_{q \mid b_1} 1_{q \mid b_4} \sum_{k \leqslant q-1} |G(k)|^4\, e_q\!\left( \frac{-b_2^2 + b_3^2 + b_5^2 - b_6^2}{4k} \right),
\]
where
\[
G(k) = \sum_{x \leqslant q} e_q(kx^2)
\]
is the Gauss sum, as before, and $1_{q \mid b}$ abbreviates the indicator function of the condition $q \mid b$. We also use $\frac{1}{k}$ to refer to the multiplicative inverse of $k$ modulo $q$. Because $|G(k)| = q^{1/2}$ by the standard evaluation (3.10), the term (5.4) is equal to
\[
\begin{cases}
q^3 - q^2 & \text{if } q \mid b_1,\ q \mid b_4,\ q \mid (-b_2^2 + b_3^2 + b_5^2 - b_6^2), \\
-q^2 & \text{if } q \mid b_1,\ q \mid b_4,\ q \nmid (-b_2^2 + b_3^2 + b_5^2 - b_6^2), \\
0 & \text{if } q \nmid b_1 \text{ or } q \nmid b_4.
\end{cases} \tag{5.5}
\]
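As a quick numerical sanity check of the first two cases of (5.5), the following Python sketch evaluates the Gauss sums directly for a small prime. It is an illustration only and plays no role in the argument: the helper names (`e_q`, `gauss_sum`, `weighted_gauss_sum`, `predicted_value`) are hypothetical, the integer `m` stands in for $-b_2^2 + b_3^2 + b_5^2 - b_6^2$, and the conditions $q \mid b_1$, $q \mid b_4$ are assumed to hold. The modular inverse via `pow(x, -1, q)` assumes Python 3.8 or later.

```python
import cmath


def e_q(a, q):
    """Additive character e_q(a) = exp(2*pi*i*a/q), with a reduced mod q."""
    return cmath.exp(2j * cmath.pi * (a % q) / q)


def gauss_sum(k, q):
    """Quadratic Gauss sum G(k) = sum over x mod q of e_q(k * x^2)."""
    return sum(e_q(k * x * x, q) for x in range(q))


def weighted_gauss_sum(m, q):
    """sum_{k=1}^{q-1} |G(k)|^4 * e_q(m / (4k)), with 1/(4k) taken mod q."""
    total = 0
    for k in range(1, q):
        inv_4k = pow(4 * k, -1, q)  # inverse exists since q is an odd prime
        total += abs(gauss_sum(k, q)) ** 4 * e_q(m * inv_4k, q)
    return total


def predicted_value(m, q):
    """The first two cases of (5.5), assuming q | b_1 and q | b_4."""
    return q**3 - q**2 if m % q == 0 else -(q**2)


if __name__ == "__main__":
    q = 11
    for m in (0, 1, 5, 22, -3):
        numeric = weighted_gauss_sum(m, q)
        print(m, round(numeric.real, 6), predicted_value(m, q))
```

The two printed columns should agree up to floating-point error, reflecting that $|G(k)|^4 = q^2$ and that the remaining sum over $k$ is a complete sum of additive characters.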
We compute that the overall size of (5.3) is
\[
\begin{cases}
O(q^3) & \text{if } q \mid b_1,\ q \mid b_4,\ q \mid (-b_2^2 + b_3^2 + b_5^2 - b_6^2), \\
O(q^3) & \text{if } q \mid b_2,\ q \mid b_5,\ q \mid (-b_3^2 + b_1^2 + b_6^2 - b_4^2), \\
O(q^3) & \text{if } q \mid b_3,\ q \mid b_6,\ q \mid (-b_1^2 + b_2^2 + b_4^2 - b_5^2), \\
O(q^2) & \text{otherwise.}
\end{cases} \tag{5.6}
\]
Now, consider the contribution to (5.2) when $j \neq 0$, $k \neq 0$, and $j \neq k$. By completing the square again, this contribution is equal to
\begin{align*}
\frac{1}{q^2} \sum_{\substack{j, k \leqslant q-1 \\ j \neq k}} & |G(j)|^2\, |G(k - j)|^2\, |G(-k)|^2 \\
& \cdot e_q\!\left( -\frac{b_1^2}{4j} - \frac{b_2^2}{4(k - j)} - \frac{b_3^2}{4k} + \frac{b_4^2}{4j} + \frac{b_5^2}{4(k - j)} + \frac{b_6^2}{4k} \right). \tag{5.7}
\end{align*}
This is equal to
\[
q \sum_{\substack{j, k \leqslant q-1 \\ j \neq k}} e_q\!\left( \frac{f(j, k)}{g(j, k)} \right),
\]
where
\[
f(j, k) = j^2 (b_3^2 - b_6^2) + jk (b_1^2 - b_2^2 - b_3^2 - b_4^2 + b_5^2 + b_6^2) + k^2 (-b_1^2 + b_4^2)
\]
and
\[
g(j, k) = 4jk(-j + k).
\]
By letting $l = j/k \pmod q$, we can reparameterize this exponential sum as
\[
q \sum_{\substack{k \leqslant q-1 \\ 2 \leqslant l \leqslant q-1}} e_q\!\left( -\frac{f(l, 1)}{k\, g(l, 1)} \right). \tag{5.8}
\]
This, in turn, is equal to
\[
q(q - 1)R - q(q - 2 - R), \tag{5.9}
\]
where $R$ is the number of $\ell$ in the range $2 \leqslant \ell \leqslant q - 1$ for which $f(\ell, 1) \equiv 0 \pmod q$. There are two relevant cases. If $q \mid (b_1^2 - b_4^2)$, $q \mid (b_2^2 - b_5^2)$, and $q \mid (b_3^2 - b_6^2)$, then $R = q - 2$, and so (5.9) is exactly equal to $q(q - 1)(q - 2)$. Otherwise, $f(\ell, 1)$ is a nonzero polynomial of degree at most two in $\ell$, and so $R = O(1)$. Hence, (5.9) is $O(q^2)$.

If we combine our knowledge of the sum (5.8) with our knowledge of the terms with $j = 0$, $k = 0$, $j = k$ from (5.6), this yields:
\[
|S(b, q)| =
\begin{cases}
O(q^3) & \text{if } q \mid b_1,\ q \mid b_4,\ q \mid (-b_2^2 + b_3^2 + b_5^2 - b_6^2), \\
O(q^3) & \text{if } q \mid b_2,\ q \mid b_5,\ q \mid (-b_3^2 + b_1^2 + b_6^2 - b_4^2), \\
O(q^3) & \text{if } q \mid b_3,\ q \mid b_6,\ q \mid (-b_1^2 + b_2^2 + b_4^2 - b_5^2), \\
O(q^3) & \text{if } q \text{ divides all of } (b_1^2 - b_4^2),\ (b_2^2 - b_5^2), \text{ and } (b_3^2 - b_6^2), \\
O(q^2) & \text{otherwise.}
\end{cases} \tag{5.10}
\]
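The passage from (5.8) to (5.9) is a complete-sum evaluation in $k$, and it can be checked by brute force for a small prime. The Python sketch below is only an illustration under stated assumptions: the function names (`f_at_l`, `g_at_l`, `direct_sum`, `closed_form`) are hypothetical, the tuple `b` holds $(b_1, \dots, b_6)$, and `pow(x, -1, q)` assumes Python 3.8 or later.

```python
import cmath


def e_q(a, q):
    """Additive character e_q(a) = exp(2*pi*i*a/q), with a reduced mod q."""
    return cmath.exp(2j * cmath.pi * (a % q) / q)


def f_at_l(l, b, q):
    """f(l, 1) mod q, with f as defined after (5.7)."""
    b1, b2, b3, b4, b5, b6 = b
    value = (l * l * (b3 * b3 - b6 * b6)
             + l * (b1 * b1 - b2 * b2 - b3 * b3 - b4 * b4 + b5 * b5 + b6 * b6)
             + (-b1 * b1 + b4 * b4))
    return value % q


def g_at_l(l, q):
    """g(l, 1) = 4 * l * (1 - l) mod q; nonzero mod q for 2 <= l <= q - 1."""
    return (4 * l * (1 - l)) % q


def direct_sum(b, q):
    """The double sum (5.8), evaluated term by term."""
    total = 0
    for l in range(2, q):
        numerator = f_at_l(l, b, q)
        for k in range(1, q):
            inverse = pow((k * g_at_l(l, q)) % q, -1, q)
            total += e_q(-numerator * inverse, q)
    return q * total


def closed_form(b, q):
    """The expression (5.9), with R counted directly."""
    R = sum(1 for l in range(2, q) if f_at_l(l, b, q) == 0)
    return q * (q - 1) * R - q * (q - 2 - R)


if __name__ == "__main__":
    q = 7
    for b in [(1, 2, 3, 4, 5, 6), (1, 2, 3, 1, 2, 3), (0, 1, 0, 0, 2, 3)]:
        print(b, round(direct_sum(b, q).real, 6), closed_form(b, q))
```

For a tuple such as `(1, 2, 3, 1, 2, 3)`, where $b_i^2 \equiv b_{i+3}^2$ for each $i$, the polynomial $f(\ell, 1)$ vanishes identically and both columns equal $q(q-1)(q-2)$, matching the first of the two cases described above.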
Plugging this bound into (5.1), we infer that
\[
S_1 - 2S_2 + S_3 \ll q^3 (T_1 + T_2) + q^2 T_3, \tag{5.11}
\]
where (after a straightforward relabeling of the variables)
\begin{align*}
T_1 &= \sum_{\substack{-q/2 < b_1, \dots, b_6 < q/2 \\ q \mid b_1 \pm b_4,\ q \mid b_2 \pm b_5,\ q \mid b_3 \pm b_6 \\ b \neq 0}} \Biggl( \prod_{i=1}^{6} \bigl| \widehat{1}_{[M]}(b_i) \bigr| \Biggr), \\
T_2 &= \frac{M^2}{q^2} \sum_{\substack{-q/2 < b_1, b_2, b_3, b_4 < q/2 \\ q \mid b_1^2 - b_2^2 + b_3^2 - b_4^2 \\ b \neq 0}} \Biggl( \prod_{i=1}^{4} \bigl| \widehat{1}_{[M]}(b_i) \bigr| \Biggr), \\
T_3 &= \sum_{\substack{-q/2 < b_1, \dots, b_6 \leqslant q/2 \\ b \neq 0}} \Biggl( \prod_{i=1}^{6} \bigl| \widehat{1}_{[M]}(b_i) \bigr| \Biggr).
\end{align*}
Indeed, because $q$ is prime, if $q \mid b_1^2 - b_4^2$, then $q \mid (b_1 + b_4)$ or $q \mid (b_1 - b_4)$. This is one of the places in which the assumption that $q$ is prime is particularly convenient (the other being the reparameterization of $j$ and $k$ to get (5.8)). For $-q/2 < b < q/2$, recall the standard bound
\[
\bigl| \widehat{1}_{[M]}(b) \bigr| \ll \min\left( \frac{M}{q}, \frac{1}{|b|} \right),
\]
which produces the estimate
\[
\sum_{-q/2 < b < q/2} \bigl| \widehat{1}_{[M]}(b) \bigr| \ll \sum_{|b| < q/M} \frac{M}{q} + \sum_{q/M \leqslant |b| < q/2} \frac{1}{|b|} \ll \log q.
\]
In the sum defining $T_1$, if $(b_1, b_2, b_3)$ are fixed, then there are $O(1)$ possibilities for $(b_4, b_5, b_6)$. Similarly, in $T_2$, once $(b_1, b_2, b_3)$ are fixed, then there are at most two possibilities for $b_4$. Hence,
\[
T_1, T_2 \ll \frac{M^3}{q^3} (\log q)^3 \quad \text{and} \quad T_3 \ll (\log q)^6,
\]
because $M < q$. Substituting these bounds into (5.11), the proof of Lemma 3.1 is complete. ∎

6 Proof of Lemma 3.2

Let $\beta$ and $\eta$ be as in the statement of Lemma 3.2. We begin the proof with another auxiliary lemma, which concerns the quantity
\[
\Delta^{*}_{\beta}(q, c_1, c_2) = \max_{M \leqslant q^{2/(2+\beta)}} |\Delta(M, q, c_1, c_2)|.
\]
Lemma 6.1 Let $q$ be an odd prime. Then, we have
\[
\sum_{c_1, c_2 \leqslant q} \Delta^{*}_{\beta}(q, c_1, c_2)^2 \ll q^{o(1)}\, q^{\frac{7 - \beta}{2 + \beta}}.
\]
Proof We will deduce this result from Lemma 3.1.
Taking $1 \leqslant K \leqslant M \leqslant q$, we begin by seeking a bound on
\[
\sum_{c_1, c_2 \leqslant q} \bigl( A(M + K, q, c_1, c_2) - A(M, q, c_1, c_2) \bigr)^2. \tag{6.1}
\]
First, $A(M + K, q, c_1, c_2) - A(M, q, c_1, c_2)$ is at most
\begin{align*}
& \bigl| \{ x \in (M, M + K],\ y, z \leqslant M + K : x^2 - y^2 \equiv c_1 \ (\mathrm{mod}\ q),\ y^2 - z^2 \equiv c_2 \ (\mathrm{mod}\ q) \} \bigr| \\
&+ \bigl| \{ y \in (M, M + K],\ x, z \leqslant M + K : x^2 - y^2 \equiv c_1 \ (\mathrm{mod}\ q),\ y^2 - z^2 \equiv c_2 \ (\mathrm{mod}\ q) \} \bigr| \\
&+ \bigl| \{ z \in (M, M + K],\ x, y \leqslant M + K : x^2 - y^2 \equiv c_1 \ (\mathrm{mod}\ q),\ y^2 - z^2 \equiv c_2 \ (\mathrm{mod}\ q) \} \bigr|.
\end{align*}
Let $E_1(q, c_1, c_2)$ refer to the first quantity, $E_2(q, c_1, c_2)$ refer to the second quantity, and $E_3(q, c_1, c_2)$ refer to the third quantity. We have that expression (6.1) is at most a constant times
\[
\sum_{c_1, c_2 \leqslant q} \bigl( E_1(q, c_1, c_2)^2 + E_2(q, c_1, c_2)^2 + E_3(q, c_1, c_2)^2 \bigr).
\]
By a change of variables, we may reduce consideration just to $E_2$. Indeed, if $x^2 - y^2 \equiv c_1 \pmod q$ and $y^2 - z^2 \equiv c_2 \pmod q$, then $z^2 - x^2 \equiv -c_1 - c_2 \pmod q$. Hence, $E_1(q, c_1, c_2) = E_2(q, -c_1 - c_2, c_1)$, and so
\[
\sum_{c_1, c_2 \leqslant q} E_1(q, c_1, c_2)^2 = \sum_{c_1, c_2 \leqslant q} E_2(q, c_1, c_2)^2.
\]
A similar argument works for $E_3(q, c_1, c_2)$. Now, $\sum_{c_1, c_2 \leqslant q} E_2(q, c_1, c_2)^2$ is equal to
\[
\left| \left\{ y_1, y_2 \in (M, M + K],\ x_1, x_2, z_1, z_2 \leqslant M + K : \begin{array}{l} x_1^2 - y_1^2 \equiv x_2^2 - y_2^2 \ (\mathrm{mod}\ q), \\ y_1^2 - z_1^2 \equiv y_2^2 - z_2^2 \ (\mathrm{mod}\ q) \end{array} \right\} \right|. \tag{6.2}
\]
Fixing integers $k_1, k_2$ and integers $y_1, y_2 \in (M, M + K]$, we will now bound the number of solutions $(x_1, x_2, z_1, z_2)$ to the system of equations
\begin{align*}
x_1^2 - x_2^2 &= y_1^2 - y_2^2 + k_1 q, \\
z_2^2 - z_1^2 &= y_2^2 - y_1^2 + k_2 q. \tag{6.3}
\end{align*}
If both $y_1^2 - y_2^2 + k_1 q \neq 0$ and $y_2^2 - y_1^2 + k_2 q \neq 0$ then, using the divisor bound, we get $q^{o(1)}$ solutions. If $y_1^2 - y_2^2 + k_1 q = 0$ and $y_2^2 - y_1^2 + k_2 q \neq 0$, we get $O(q^{o(1)} M)$ solutions. If both $y_1^2 - y_2^2 + k_1 q = 0$ and $y_2^2 - y_1^2 + k_2 q = 0$, we get $O(M^2)$ solutions. Note that, in such a case, we must have $k_1 = -k_2$.
Regarding the other conditions on $(y_1, y_2, k_1, k_2)$, we note that the variables $k_1, k_2$ are both restricted to intervals of length $O(M^2/q + 1)$. Furthermore, if $k_1 \neq 0$, then the total number of solutions $(y_1, y_2, k_1)$ to $y_1^2 - y_2^2 = k_1 q$, with $y_1, y_2 \in (M, M + K]$, is $O(q^{o(1)} (KM/q + 1))$, because $|y_1^2 - y_2^2| = O(KM)$. If, however, $k_1 = 0$, then of course $y_1 = y_2$, so there are $O(K)$ solutions here. Thus, summing over $(y_1, y_2, k_1, k_2)$, we bound (6.2) above by
\begin{align*}
q^{o(1)} \Biggl( & K^2 \left( \frac{M^2}{q} + 1 \right)^{2} + M \left( \frac{M^2}{q} + 1 \right) \left( \frac{KM}{q} + 1 \right) + KM \left( \frac{M^2}{q} + 1 \right) \\
& + M^2 \left( \frac{KM}{q} + 1 \right) + KM^2 \Biggr).
\end{align*}
Because $K \leqslant M \leqslant q$, we may simplify the above and conclude that
\[
\sum_{c_1, c_2 \leqslant q} \bigl( A(M + K, q, c_1, c_2) - A(M, q, c_1, c_2) \bigr)^2 \ll q^{o(1)} (K^2 M^4 q^{-2} + KM^2). \tag{6.4}
\]
One may think of this upper bound as being given by the sum of an average contribution and a diagonal contribution.

From (6.4), together with previous lemmas, it turns out that one can control
\[
\sum_{1 \leqslant c_1, c_2 \leqslant q} \max_{0 \leqslant H \leqslant K} \Delta(M + H, q, c_1, c_2)^2.
\]
Indeed, because $|(a_1 - a_2)^2 - (b_1 - b_2)^2| \ll (a_1 - b_1)^2 + (a_2 - b_2)^2$ for all reals $a_1, b_1, a_2, b_2$, we have that
\[
\sum_{c_1, c_2 \leqslant q} \max_{0 \leqslant H \leqslant K} \Delta(M + H, q, c_1, c_2)^2 - \sum_{c_1, c_2 \leqslant q} \Delta(M, q, c_1, c_2)^2 \tag{6.5}
\]
is at most a constant times
\[
\sum_{c_1, c_2 \leqslant q} \bigl( A(M + K, q, c_1, c_2) - A(M, q, c_1, c_2) \bigr)^2 + \max_{0 \leqslant H \leqslant K} \left( \frac{(M + H)^3 - M^3}{q^3} \right)^{2} \sum_{c_1, c_2 \leqslant q} A_0(q, c_1, c_2)^2. \tag{6.6}
\]
From expression (6.4) and Lemma 3.3, we conclude that (6.6) is at most $q^{o(1)} (K^2 M^4 q^{-2} + KM^2)$ again. Finally, using Lemma 3.1 combined with (6.5), we end up with the bound
\[
\sum_{c_1, c_2 \leqslant q} \max_{0 \leqslant H \leqslant K} \Delta(M + H, q, c_1, c_2)^2 \ll q^{o(1)} (M^3 + q^2) + q^{o(1)} K^2 M^4 q^{-2}. \tag{6.7}
\]
(The $KM^2$ term has been absorbed into the $M^3$ term.)
Before we use (6.7) to prove the lemma, we need the bound
\[
\sum_{c_1, c_2 \leqslant q} \max_{0 \leqslant H \leqslant K} \Delta(H, q, c_1, c_2)^2 \ll q^{o(1)} (K^6 q^{-2} + K^3). \tag{6.8}
\]
This may be proved by an identical analysis to the one above, proceeding with $M = 0$.

Now, returning to the original object of the lemma, we may cover $\{ n \in \mathbb{N} : 1 \leqslant n \leqslant q^{2/(2+\beta)} \}$ by the interval $[1, K]$ together with at most $q^{2/(2+\beta)} K^{-1}$ other intervals, each of the form $[M, M + K]$ with $K \leqslant M$. Thus,
\begin{align*}
\sum_{c_1, c_2 \leqslant q} \Delta^{*}_{\beta}(q, c_1, c_2)^2
&\leqslant q^{o(1)} (K^6 q^{-2} + K^3) + \sum_{l \leqslant q^{2/(2+\beta)} K^{-1}} \; \sum_{c_1, c_2 \leqslant q} \max_{X \in [lK, (l+1)K)} \Delta(X, q, c_1, c_2)^2 \\
&\leqslant q^{o(1)} (K^6 q^{-2} + K^3) + q^{o(1)} \sum_{l \leqslant q^{2/(2+\beta)} K^{-1}} (l^3 K^3 + q^2 + l^4 K^6 q^{-2}) \\
&\leqslant q^{o(1)} (K^6 q^{-2} + K^3) + q^{o(1)} \Bigl( q^{\frac{8}{2+\beta}} K^{-1} + q^{\frac{2}{2+\beta} + 2} K^{-1} + q^{\frac{10}{2+\beta} - 2} K \Bigr).
\end{align*}
Because $\beta \leqslant 1$ always, we have $q^{\frac{2}{2+\beta} + 2} \leqslant q^{\frac{8}{2+\beta}}$, which allows us to reduce matters to the bound
\[
q^{o(1)} \Bigl( K^6 q^{-2} + K^3 + q^{\frac{8}{2+\beta}} K^{-1} + q^{\frac{10}{2+\beta} - 2} K \Bigr).
\]
Optimizing in $K$, we find that the minimum value is achieved when $K \asymp q^{\frac{1+\beta}{2+\beta}}$, in which case the third and fourth terms have the same order of magnitude. We conclude that
\[
\sum_{c_1, c_2 \leqslant q} \Delta^{*}_{\beta}(q, c_1, c_2)^2 \ll q^{o(1)}\, q^{\frac{7 - \beta}{2 + \beta}},
\]
and the lemma is proved. ∎

Proof of Lemma 3.2 Let $C$ be a suitably large absolute constant. For all odd primes $q$, and integers $a$ such that $(a, q) = 1$, define
\[
D_{\beta, \eta}(a, q) := \sum_{\substack{r_1, r_2 \neq 0 \\ |r_1| \leqslant q^{\frac{2-\beta}{2+\beta} + C\eta} \\ |r_2| \leqslant q^{\frac{2-\beta}{2+\beta} + C\eta}}} \Delta^{*}_{\beta}(q, ar_1, ar_2). \tag{6.9}
\]
Now, suppose $\alpha \in [0, 1]$ and fractions $a/q$ and $N$ satisfy (3.1) and (3.2). We claim that, if $C$ is large enough, and if $q$ is large enough in terms of $s$, $t$, and $\eta$, it follows that
\[
\sum_{\substack{(r_1, r_2) \in S^{+} \\ r_1 r_2 \neq 0 \\ r_1 + r_2 \neq 0}} \Delta(N, q, ar_1, ar_2) \leqslant D_{\beta, \eta}(a, q). \tag{6.10}
\]
Indeed, because $N \leqslant q^{2/(2+\beta)}$, we have
\[
\Delta(N, q, ar_1, ar_2) \leqslant \Delta^{*}_{\beta}(q, ar_1, ar_2).
\]
In addition, we have both
\[
\frac{qL}{N} \ll N^{\frac{2+\beta}{2} - \beta + 11\eta} \ll q^{\left( \frac{2-\beta}{2} + 11\eta \right) \frac{2}{2+\beta}} \ll q^{\frac{2-\beta}{2+\beta} + C\eta}
\]
and
\[
N^2 q^{\eta - 1} \ll N^{2 - (1 - \eta)\left( \frac{2+\beta}{2} + 10\eta \right)} \ll q^{\left( \frac{2-\beta}{2} + C\eta \right) \frac{2}{2+\beta}} \ll q^{\frac{2-\beta}{2+\beta} + C\eta}.
\]
Referring to the definition (3.6) of $S^{+}$, we have thus settled the inequality (6.10).

Therefore, to prove Lemma 3.2, it suffices to show that, for almost all $\alpha \in [0, 1]$, for all $(a, q, N)$ satisfying (3.1) and (3.2),
\[
D_{\beta, \eta}(a, q)\, L^{-2} N^{-1} \ll_{\alpha, \beta, \eta} q^{-\eta}. \tag{6.11}
\]
In fact, by (3.2), it suffices to show that, for almost all $\alpha \in [0, 1]$ and for all $(a, q)$ satisfying (3.1),
\[
D_{\beta, \eta}(a, q)\, q^{\frac{-6 + 4\beta}{2 + \beta}} \ll_{\alpha, \beta, \eta} q^{-C\eta}. \tag{6.12}
\]
This expression makes no mention of $N$ or $L$, which will be a technical necessity in the Borel–Cantelli argument to come.

To show (6.12), for each $q$, we define
\[
\mathrm{Bad}_{\beta, \eta}(q) = \Bigl\{ a \leqslant q : (a, q) = 1,\ D_{\beta, \eta}(a, q)\, q^{\frac{-6 + 4\beta}{2 + \beta}} \geqslant q^{-C\eta} \Bigr\}.
\]
It will be enough, then, to show that, for almost all $\alpha \in [0, 1]$, only finitely many of the fractions $a/q$ satisfying (3.1) also satisfy $a \in \mathrm{Bad}_{\beta, \eta}(q)$. To bound the size of $\mathrm{Bad}_{\beta, \eta}(q)$, we first note that
\[
|\mathrm{Bad}_{\beta, \eta}(q)| \leqslant q^{\frac{-6 + 4\beta}{2 + \beta} + C\eta} \sum_{\substack{1 \leqslant a \leqslant q \\ (a, q) = 1}} D_{\beta, \eta}(a, q).
\]
Furthermore, we define $f_{\beta, \eta}(q, c_1, c_2)$ to be the number of triples $(a, r_1, r_2)$ such that
\[
ar_1 \equiv c_1 \pmod q, \qquad ar_2 \equiv c_2 \pmod q,
\]
for which $1 \leqslant a \leqslant q$, $(a, q) = 1$, $r_1 r_2 \neq 0$, and $|r_1|, |r_2| \leqslant q^{\frac{2-\beta}{2+\beta} + C\eta}$. We then have
\[
|\mathrm{Bad}_{\beta, \eta}(q)| \leqslant q^{\frac{-6 + 4\beta}{2 + \beta} + C\eta} \sum_{0 \leqslant c_1, c_2 \leqslant q-1} f_{\beta, \eta}(q, c_1, c_2)\, \Delta^{*}_{\beta}(q, c_1, c_2). \tag{6.13}
\]
To estimate (6.13), we need the following simple lemma about the size of $f_{\beta, \eta}(q, c_1, c_2)$.

Lemma 6.2 We have
\[
\sum_{c_1, c_2 \leqslant q} f_{\beta, \eta}(q, c_1, c_2)^2 \ll q^{\frac{4 - 2\beta}{2 + \beta} + 2C\eta} \Bigl( q + q^{\frac{4 - 2\beta}{2 + \beta} + 2C\eta} \Bigr) q^{o(1)}.
\]
Proof We have that $\sum_{c_1, c_2 \leqslant q} f_{\beta, \eta}(q, c_1, c_2)^2$ is equal to the number of pairs of triples $(a, r_1, r_2)$, $(b, s_1, s_2)$ such that
\[
ar_1 \equiv bs_1 \pmod q, \qquad ar_2 \equiv bs_2 \pmod q, \tag{6.14}
\]
with $1 \leqslant a, b \leqslant q$, $(a, q) = 1$, $(b, q) = 1$, $r_1 r_2 s_1 s_2 \neq 0$ and $|r_1|, |r_2|, |s_1|, |s_2| \leqslant q^{\frac{2-\beta}{2+\beta} + C\eta}$. By multiplying the first equation by $br_2$ and the second equation by $br_1$, one sees that, for every such solution, we must also have
\[
s_1 r_2 \equiv s_2 r_1 \pmod q. \tag{6.15}
\]
The mapping between the solutions to (6.14) and (6.15) is at most $q$-to-1, because if the variables $a, r_1, r_2, s_1, s_2$ are fixed, then $b$ is uniquely determined in (6.14). Solutions to (6.15) are given by solutions $(r_1, r_2, s_1, s_2, k)$ to
\[
s_1 r_2 - s_2 r_1 = kq,
\]
with $k$ in the range $0 \leqslant |k| \ll q^{\frac{4 - 2\beta}{2 + \beta} - 1 + 2C\eta}$. By the divisor bound, after $k$, $s_1$ and $r_2$ are fixed there are at most $q^{o(1)}$ valid choices of $s_2$ and $r_1$, so the total number of solutions to (6.15) is at most
\[
q^{\frac{2(2 - \beta)}{2 + \beta} + 2C\eta} \Bigl( 1 + q^{\frac{4 - 2\beta}{2 + \beta} - 1 + 2C\eta} \Bigr).
\]
Multiplying by $q$ to get the number of solutions to (6.14), we prove the lemma. ∎

(The reader may wish to note that if $\beta > 2/3$, then the only valid value of $k$ in the above is $k = 0$, which simplifies the remainder of the analysis for these cases.)

Returning to (6.13), we have
\begin{align*}
|\mathrm{Bad}_{\beta, \eta}(q)|
&\leqslant q^{\frac{-6 + 4\beta}{2 + \beta} + C\eta} \cdot \Biggl( \sum_{c_1, c_2 \leqslant q} \Delta^{*}_{\beta}(q, c_1, c_2)^2 \Biggr)^{1/2} \cdot \Biggl( \sum_{c_1, c_2 \leqslant q} f_{\beta, \eta}(q, c_1, c_2)^2 \Biggr)^{1/2} \\
&\ll q^{\frac{-6 + 4\beta}{2 + \beta}} \cdot \Bigl( q^{\frac{1}{2} + \frac{2 - \beta}{2 + \beta}} + q^{\frac{4 - 2\beta}{2 + \beta}} \Bigr) \cdot q^{\frac{7 - \beta}{2(2 + \beta)}} \cdot q^{C\eta + o(1)} \\
&\ll \Bigl( q^{\frac{1 + 6\beta}{2(2 + \beta)}} + q^{\frac{3 + 3\beta}{2(2 + \beta)}} \Bigr) q^{C\eta + o(1)}. \tag{6.16}
\end{align*}
Here, we have used Lemmas 6.1 and 6.2 to go from the first line to the second line. Now, recall that $\beta < \tfrac{3}{4}$, which implies that
\[
\frac{1 + 6\beta}{2(2 + \beta)} < 1.
\]
The other term is less severe, and, in fact, $\beta < 1$ implies
\[
\frac{3 + 3\beta}{2(2 + \beta)} < 1.
\]
Because $\eta$ is small enough, and $C$ is absolute, we conclude that
\[
|\mathrm{Bad}_{\beta, \eta}(q)| \ll_{\beta, \eta} q^{1 - 2\eta}. \tag{6.17}
\]
Now, we can finally complete the proof of Lemma 3.2 by using the first Borel–Cantelli lemma. Indeed, pick $\alpha$ uniformly at random in $[0, 1]$, and for each prime $q \geqslant 3$, let $E_{q, \beta, \eta}$ be the event that there exists an $a$ with $1 \leqslant a \leqslant q$, $(a, q) = 1$,
\[
\left| \alpha - \frac{a}{q} \right| < \frac{1}{q^{2 - \eta}},
\]
and $a \in \mathrm{Bad}_{\beta, \eta}(q)$. Then,
\[
\mathbb{P}(E_{q, \beta, \eta}) \leqslant \sum_{a \in \mathrm{Bad}_{\beta, \eta}(q)} \mu\!\left( \left( \frac{a}{q} - \frac{1}{q^{2 - \eta}},\ \frac{a}{q} + \frac{1}{q^{2 - \eta}} \right) \right) \ll_{\beta, \eta} q^{-1 - \eta}.
\]
Then, $\sum_{q \geqslant 3} \mathbb{P}(E_{q, \beta, \eta}) < \infty$, and so, with probability 1, only finitely many of the events $E_{q, \beta, \eta}$ occur. Thus, by our long chain of reductions, Lemma 3.2 follows.

7 Concluding remarks

The proof of all of our main theorems is now complete. However, before concluding the paper, it is certainly worth discussing whether $\beta < \tfrac{3}{4}$ represents a natural limit of our approach.

The chain of inequalities (6.16) is the critical moment of the entire proof, and this particular application of Cauchy–Schwarz is the main source of our loss in the range of $\beta$. Suppose that instead we had used the bound
\[
|\mathrm{Bad}_{\beta, \eta}(q)| \leqslant q^{\frac{-6 + 4\beta}{2 + \beta} + C\eta} \Biggl( \sum_{c_1, c_2 \leqslant q} f_{\beta, \eta}(q, c_1, c_2) \Biggr)^{1/2} \cdot \Biggl( \sum_{c_1, c_2 \leqslant q} f_{\beta, \eta}(q, c_1, c_2)\, \Delta^{*}_{\beta}(q, c_1, c_2)^2 \Biggr)^{1/2}. \tag{7.1}
\]
For simplicity of exposition here, we will assume that $\beta > 2/3$, that $C = 0$, and we will ignore all $q^{o(1)}$ terms. It is then easy to see that
\[
\sum_{c_1, c_2 \leqslant q} f_{\beta, \eta}(q, c_1, c_2) \approx q^{1 + \frac{2(2 - \beta)}{2 + \beta}}.
\]
Combining this bound with Lemma 6.2, one may conclude that $f_{\beta, \eta}(q, c_1, c_2) \approx 1$ for $q^{1 + \frac{2(2 - \beta)}{2 + \beta}}$ pairs $(c_1, c_2)$, and is 0 otherwise.
So, given what we know from Lemma 6.1 about the value of $\Delta^{*}_{\beta}(q, c_1, c_2)^2$ averaged over all pairs $c_1$ and $c_2$, it is not utterly unreasonable to hope that one could prove
\[
\sum_{c_1, c_2 \leqslant q} f_{\beta, \eta}(q, c_1, c_2)\, \Delta^{*}_{\beta}(q, c_1, c_2)^2 \ll_{\beta, \eta} q^{1 + \frac{2(2 - \beta)}{2 + \beta}} \cdot q^{\frac{7 - \beta}{2 + \beta}} \cdot q^{-2} = q^{\frac{9 - 4\beta}{2 + \beta}}, \tag{7.2}
\]
provided that the weight of $\Delta^{*}_{\beta}(q, c_1, c_2)^2$ does not concentrate on the support of $f$. Putting this bound into (7.1), one would then get
\[
|\mathrm{Bad}_{\beta, \eta}(q)| \ll_{\beta, \eta} q^{\frac{3 + 3\beta}{2(2 + \beta)}},
\]
that is, only the second term from (6.16) would occur. As we have already remarked, we would then derive
\[
|\mathrm{Bad}_{\beta, \eta}(q)| \ll_{\beta, \eta} q^{1 - 2\eta},
\]
provided $\beta < 1$ and $\eta$ is small enough. This estimate would expand the range of Theorem 1.4 all the way to $L > N^{\varepsilon}$. Unfortunately, we have not been able to prove a version of Lemma 3.1 that includes the weight $f_{\beta, \eta}(q, c_1, c_2)$ in the manner of expression (7.2).

One also recalls that in our application of Borel–Cantelli, we did not need to bound $|\mathrm{Bad}_{\beta, \eta}(q)|$ uniformly for all $q$. One would be satisfied with
\[
\sum_{q \text{ prime}} \frac{|\mathrm{Bad}_{\beta, \eta}(q)|}{q^2} < \infty.
\]
This suggests expressing the relevant exponential sums as an average of Kloosterman-type sums over the modulus $q$, which might be another route for future research.

Our final remark is that if $L \to \infty$ and $N/L \to \infty$ as $N \to \infty$, then, in the random model (1.4), the asymptotics are governed by the Central Limit Theorem. One can derive
\[
\frac{Z_{L, N} - L}{\sqrt{L}} \xrightarrow{\ \mathrm{dist}\ } \mathcal{N}(0, 1),
\]
as $N \to \infty$. Theorem 1.4 could then be considered as a first step toward showing that, for almost all $\alpha$, the skewness of $W_{\alpha, L, N}$ satisfies $\mathbb{E}\bigl( (W_{\alpha, L, N} - L)/\sqrt{L} \bigr)^3 \to 0$ as $N \to \infty$, with $L$ in a certain range.
However, to show this asymptotic, one would need to be able to extract the lower degree main term from $\mathbb{E} W_{\alpha, L, N}^3$ (which is $3L^2$, as in Section 2), and then subsequently show that the error term in Theorem 2.1 is, in fact, $o(L^{3/2})$, rather than merely $o(L^3)$.

Appendix A Pair correlations of the dilated squares at scale $N^{-\beta}$

In this section, we briefly indicate how one can deduce the following fact from the methods of the literature.

Theorem A.1 Let $\varepsilon \in (0, \tfrac{1}{4})$. Then, for almost all $\alpha \in [0, 1]$, for all $1 \leqslant L \leqslant N^{1 - \varepsilon}$, and for all $(\log N)^{-1} \leqslant s \leqslant \log N$, we have
\[
R_2(\alpha, L, N, 1_{[-s, s]}) = 2Ls \bigl( 1 + O_{\varepsilon, \alpha}(N^{-\varepsilon/13}) \bigr).
\]
Note that this result immediately implies the estimate (2.3), by approximating the function $f$ in (2.3) with a suitable step function. We have made no attempt to obtain the best possible error term in Theorem A.1, nor the largest admissible ranges for $s$ and for $L$. One will observe from the proof that rather better bounds would certainly follow if one assumed at the outset that $L$ were a slowly varying function of $N$.

A version of Theorem A.1 follows from arguments of Aistleitner, Larcher, and Lewko [2] as well as from arguments of Rudnick [21] and Rudnick and Sarnak [23], by changing the relevant parameters. If one wanted an explicit characterization of the set of suitable $\alpha$ in terms of properties of its rational approximations, then one could also adapt the (much more involved) methods of Heath-Brown [12]. In particular, the material of the present section is in no way novel. However, we decided to add some explanations on how to deduce Theorem A.1, partly in order to make the exposition of our previous arguments complete and self-contained, and partly in order to describe explicitly a suitable "sandwiching argument" for this result (expanding upon the description in [2]).
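To make the quantity in Theorem A.1 concrete, here is a brute-force Python sketch. It assumes that $R_2(\alpha, L, N, 1_{[-s,s]})$ is the normalized count $\frac{1}{N} \#\{ 1 \leqslant x \neq y \leqslant N : \|\alpha(x^2 - y^2)\| \leqslant sL/N \}$, which is how the quantity enters the proof of Lemma A.2; the function name `pair_correlation` and the sample parameters are illustrative choices rather than anything taken from the paper, and floating-point arithmetic limits the sketch to moderate $N$.

```python
import math


def pair_correlation(alpha, L, N, s):
    """Empirical R_2(alpha, L, N, 1_[-s, s]) under the assumed definition:
    (1/N) * #{1 <= x != y <= N : ||alpha * (x^2 - y^2)|| <= s * L / N},
    where ||t|| is the distance from t to the nearest integer."""
    half_width = s * L / N
    unordered_hits = 0
    for x in range(2, N + 1):
        for y in range(1, x):
            t = (alpha * (x * x - y * y)) % 1.0
            distance_to_integer = min(t, 1.0 - t)
            if distance_to_integer <= half_width:
                unordered_hits += 1
    # Each unordered pair {x, y} corresponds to two ordered pairs.
    return 2 * unordered_hits / N


if __name__ == "__main__":
    # Illustrative parameters only; compare the empirical value with 2 * L * s.
    alpha, L, N, s = math.sqrt(2), 20.0, 400, 1.0
    print(pair_correlation(alpha, L, N, s), 2 * L * s)
```

The double loop costs $O(N^2)$ operations, so this is suitable only for small experiments; Theorem A.1 itself is a statement about almost all $\alpha$ and about the limit of large $N$, which no finite computation can confirm.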
We begin with the following auxiliary lemma (where again no attempt was made to obtain the best possible error term).

Lemma A.2  For each $m \in \mathbb{N}$, let $N_m = m^{4}$. Letting $\varepsilon \in (0, \tfrac{1}{4})$, for each $i$ in the range $-m^{\varepsilon/3}/10 \leqslant i \leqslant m^{\varepsilon/3}$, let $\beta_{m,i} = i/m^{\varepsilon/3}$ and $L_{m,i} = N_m^{\beta_{m,i}}$. Let $s_m$ be a real quantity that satisfies $m^{-\varepsilon/10} < s_m < m^{\varepsilon/10}$ for large enough $m$. Then, for almost all $\alpha \in [0,1]$, for all $m$, and for all $i$ in the range $-m^{\varepsilon/3}/10 \leqslant i \leqslant m^{\varepsilon/3}$, we have
\[
R_2(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]}) = 2 s_m L_{m,i} + O_{\alpha,\varepsilon}\Bigl(\frac{L_{m,i}}{m^{\frac{13}{15} - \frac{\varepsilon}{3}}}\Bigr). \tag{A.1}
\]

Proof of Lemma A.2  Let $I = (\gamma, \delta)$ be an arc on the torus $\mathbb{R}/\mathbb{Z}$ such that $0 < \delta - \gamma < 1$. Let $J \geqslant 1$ be an integer. To proceed, we introduce trigonometric polynomials $S_J^{\pm}(x)$, of degree $J$, which approximate the indicator function $\chi_I$ from above and below. Selberg, and also Vaaler, constructed such polynomials (cf. Montgomery [19, pp. 5–6]). Indeed, there exists
\[
S_J^{\pm}(x) = \sum_{|j| \leqslant J} s_J^{\pm}(j)\, e(jx),
\]
satisfying
\[
S_J^{-}(x) \leqslant \chi_I(x) \leqslant S_J^{+}(x) \qquad (x \in \mathbb{R}/\mathbb{Z}),
\]
such that
\[
s_J^{\pm}(0) = \delta - \gamma \pm \frac{1}{J+1}, \qquad |s_J^{\pm}(j)| \leqslant \frac{1}{J+1} + \min\Bigl(\delta - \gamma, \frac{1}{\pi |j|}\Bigr) \quad (0 < |j| \leqslant J).
\]

For our purposes, given $N_m$ and some growth function $w(N_m) \geqslant 1$, to be specified later, we specify $J_{m,i} = \lfloor N_m w(N_m)/L_{m,i} \rfloor$, $\gamma_{m,i} = -s_m L_{m,i}/N_m$, and $\delta_{m,i} = s_m L_{m,i}/N_m$. Note that the definitions of $s_m$ and $L_{m,i}$ in the statement of the lemma imply that these choices yield a valid arc on the torus. Then, defining
\[
R_2^{\pm}(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]}) := \frac{1}{N_m} \sum_{x \neq y \leqslant N_m} S_{J_{m,i}}^{\pm}\bigl(\alpha(x^{2} - y^{2})\bigr),
\]
we can control $R_2$ from above and below via
\[
R_2^{-}(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]}) \leqslant R_2(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]}) \leqslant R_2^{+}(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]}). \tag{A.2}
\]
Let
\[
E^{\pm}(L_{m,i}, N_m, s_m) := \int_0^1 R_2^{\pm}(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]})\, d\alpha
\]
be the expected value of $R_2^{\pm}(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]})$. It is easy to see that
\[
E^{\pm}(L_{m,i}, N_m, s_m) = 2 s_m L_{m,i} + O\bigl(L_{m,i}/w(N_m)\bigr) + O\bigl(s_m L_{m,i}/N_m\bigr). \tag{A.3}
\]

Furthermore, by using orthogonality, we also see that the variance
\[
\int_0^1 \bigl(R_2^{\pm}(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]}) - E^{\pm}(L_{m,i}, N_m, s_m)\bigr)^{2}\, d\alpha
\]
of $R_2^{\pm}$ is at most
\[
\frac{1}{N_m^{2}} \sum_{\substack{x_1 \neq y_1 \leqslant N_m \\ x_2 \neq y_2 \leqslant N_m}} \; \sum_{\substack{0 < |j_1|, |j_2| \leqslant J_{m,i} \\ j_1(x_1^{2} - y_1^{2}) + j_2(x_2^{2} - y_2^{2}) = 0}} \bigl| s_{J_{m,i}}^{\pm}(j_1)\, s_{J_{m,i}}^{\pm}(j_2) \bigr|.
\]
(Note that there are no contributions from terms in which $j_1 = 0$ and $j_2 \neq 0$, because the condition $j_2(x_2^{2} - y_2^{2}) = 0$ cannot be satisfied.) Because $|s_{J_{m,i}}^{\pm}(j)| \leqslant (2 s_m + 1) L_{m,i}/N_m$ for all $j$ in the range $0 < |j| \leqslant J_{m,i}$, we can bound the variance above by $O((s_m + 1)^{2})$ times
\[
\frac{L_{m,i}^{2}}{N_m^{4}} \left| \left\{ (j_1, x_1, y_1, j_2, x_2, y_2) \in \mathbb{Z}^{6} : \begin{array}{l} j_1(x_1^{2} - y_1^{2}) = j_2(x_2^{2} - y_2^{2}), \\ 1 \leqslant x_k \neq y_k \leqslant N_m, \\ 0 < |j_k| \leqslant N_m w(N_m)/L_{m,i}, \\ k = 1, 2 \end{array} \right\} \right|. \tag{A.4}
\]

Let us fix the first three variables above, that is, $j_1, x_1, y_1$. Then, by writing $x_2^{2} - y_2^{2} = d_1 d_2$, with $d_1 = x_2 - y_2$ and $d_2 = x_2 + y_2$, we deduce that $d_k \mid j_1(x_1^{2} - y_1^{2})$ for $k = 1, 2$. By the divisor bound, there are at most $N_m^{o(1)}$ many possibilities for $d_1, d_2$ (provided that $w(N_m) \leqslant N_m^{O(1)}$). Moreover, any choice of $d_1, d_2$ uniquely determines the variables $x_2, y_2$ via $x_2 = (d_1 + d_2)/2$ and $y_2 = (d_2 - d_1)/2$. Furthermore, we note that $j_2$ is determined up to $\ll N_m^{o(1)}$ many choices. The upshot is that given one of the $O(N_m^{3} w(N_m)/L_{m,i})$ many admissible choices for $j_1, x_1, y_1$, the second block of variables $j_2, x_2, y_2$ is determined up to $O(N_m^{o(1)})$ many possibilities. Therefore, the variance of $R_2^{\pm}$ is at most $O\bigl((s_m + 1)^{2} L_{m,i}\, w(N_m)\, N_m^{-1+o(1)}\bigr)$.

For the ease of exposition, we let
\[
\kappa_{m,i}^{\pm} = \bigl| R_2^{\pm}(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]}) - E^{\pm}(L_{m,i}, N_m, s_m) \bigr|.
\]
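The divisor parametrization used in the variance bound above can be made concrete. The following Python sketch (our own illustration, with a hypothetical function name) lists all ways of writing a positive integer $n$ as $x^{2} - y^{2}$ with $x > y \geqslant 1$, by running over factorizations $n = d_1 d_2$ with $d_1 = x - y < d_2 = x + y$ of equal parity. It assumes $x > y$ for simplicity, whereas the argument above also allows $x < y$; the number of representations is at most the number of divisors of $n$, which is the source of the $N_m^{o(1)}$ factor.

```python
def difference_of_squares_representations(n):
    """Return all (x, y) with x > y >= 1 and x*x - y*y == n.

    Each representation corresponds to a factorization n = d1 * d2 with
    d1 = x - y < d2 = x + y and d1, d2 of the same parity, recovered via
    x = (d1 + d2) / 2 and y = (d2 - d1) / 2.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    representations = []
    d1 = 1
    while d1 * d1 < n:  # enforces d1 < d2, hence y >= 1 when parities agree
        if n % d1 == 0:
            d2 = n // d1
            if (d1 + d2) % 2 == 0:
                representations.append(((d1 + d2) // 2, (d2 - d1) // 2))
        d1 += 1
    return representations
```

The trial-division loop is chosen for transparency rather than speed; it is a sketch of the counting principle and not part of the proof.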
We infer, by Chebychev’s inequality, that
\[
\mathbb{P}\Bigl(\alpha \in [0,1] : \kappa_{m,i}^{\pm} \geqslant \frac{L_{m,i}}{w(N_m)}\Bigr) \leqslant \frac{(s_m + 1)^{2}\, w(N_m)^{3}\, N_m^{o(1)}}{N_m L_{m,i}} \leqslant \frac{(s_m + 1)^{2}\, w(N_m)^{3}\, N_m^{o(1)}}{N_m^{9/10}}.
\]
Choose the growth function $w(N_m)$ to be
\[
w(N_m) = \Bigl(\frac{N_m^{9/10}}{m^{1+\varepsilon}}\Bigr)^{1/3}.
\]
Therefore, because $s_m < m^{\varepsilon/10}$, we conclude that
\[
\mathbb{P}\Bigl(\alpha \in [0,1] : \kappa_{m,i}^{\pm} \geqslant \frac{L_{m,i}}{w(N_m)}\Bigr) \ll_{\varepsilon} \frac{1}{m^{1+\frac{\varepsilon}{2}}}.
\]
Summing over $i$ in the range $-m^{\varepsilon/3}/10 \leqslant i \leqslant m^{\varepsilon/3}$, with the union bound, we get
\[
\mathbb{P}\Bigl(\alpha \in [0,1] : \exists\, i \in [-m^{\varepsilon/3}/10, m^{\varepsilon/3}] \text{ s.t. } \kappa_{m,i}^{\pm} \geqslant \frac{L_{m,i}}{w(N_m)}\Bigr) \ll_{\varepsilon} \frac{1}{m^{1+\frac{\varepsilon}{6}}}.
\]
Recalling our estimate (A.3), the first Borel–Cantelli lemma implies that, for almost every $\alpha \in [0,1]$, the following relation holds for all $m \geqslant 1$ and all admissible $i \in [-m^{\varepsilon/3}/10, m^{\varepsilon/3}]$:
\[
R_2^{\pm}(\alpha, L_{m,i}, N_m, \mathbf{1}_{[-s_m,s_m]}) = 2 s_m L_{m,i} + O_{\alpha,\varepsilon}\Bigl(\frac{L_{m,i}}{w(N_m)}\Bigr) + O_{\alpha}\Bigl(\frac{s_m L_{m,i}}{N_m}\Bigr).
\]
From (A.2), and substituting in the explicit growth function $w(N_m)$, the lemma follows. ∎

Proof of Theorem A.1  We begin with the trivial observation that, by combining $s$ and $L$ into a single parameter, it is enough to show that, for almost all $\alpha \in [0,1]$, for all $N$, and for all $L$ in the range $N^{-1/11} \leqslant L \leqslant N^{1-\varepsilon/2}$,
\[
R_2(\alpha, L, N, \mathbf{1}_{[-1,1]}) = 2L\bigl(1 + O_{\alpha,\varepsilon}(N^{-\frac{\varepsilon}{13}})\bigr). \tag{A.5}
\]
Now, for each $N$, choose $m$ such that $N_m \leqslant N < N_{m+1}$, where $N_m = m^{4}$ as in Lemma A.2. We put $\theta_m = N_{m+1}/N_m$. Then, for any $L$,
\[
R_2(\alpha, L, N, \mathbf{1}_{[-1,1]}) \leqslant \frac{N_{m+1}}{N} R_2\bigl(\alpha, L, N_{m+1}, \mathbf{1}_{\frac{N_{m+1}}{N}[-1,1]}\bigr) \leqslant \theta_m R_2\bigl(\alpha, L, N_{m+1}, \mathbf{1}_{\theta_m[-1,1]}\bigr),
\]
and similarly,
\[
R_2(\alpha, L, N, \mathbf{1}_{[-1,1]}) \geqslant \frac{N_m}{N} R_2\bigl(\alpha, L, N_m, \mathbf{1}_{\frac{N_m}{N}[-1,1]}\bigr) \geqslant \theta_m^{-1} R_2\bigl(\alpha, L, N_m, \mathbf{1}_{\theta_m^{-1}[-1,1]}\bigr).
\]
For each $N^{-1/11} \leqslant L \leqslant N^{1-\varepsilon/2}$, there exists an $i$ in the range $-m^{\varepsilon/3}/10 \leqslant i \leqslant m^{\varepsilon/3}$ such that $L_{m,i} \leqslant L \leqslant L_{m,i+1}$, where $L_{m,i}$ is as in Lemma A.2.
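To make the bookkeeping of the sandwiching step tangible, here is a small Python sketch (hypothetical helper names, not taken from the paper) that, for a given $N$, finds $m$ with $N_m = m^{4} \leqslant N < N_{m+1}$ together with $\theta_m = N_{m+1}/N_m$, and, for a given $L$, locates a grid index $i$ with $L_{m,i} \leqslant L \leqslant L_{m,i+1}$. The grid index is computed in floating point and assumes $m \geqslant 2$ and $L > 0$; the sketch does not check that $i$ lies in the admissible range of Lemma A.2.

```python
import math
from fractions import Fraction


def sandwich_parameters(N):
    """Return (m, N_m, N_{m+1}, theta_m) with N_m = m**4 <= N < N_{m+1}."""
    m = math.isqrt(math.isqrt(N))  # floor of the fourth root of N
    N_m, N_next = m ** 4, (m + 1) ** 4
    return m, N_m, N_next, Fraction(N_next, N_m)


def grid_value(i, m, eps):
    """L_{m,i} = N_m ** (i / m**(eps/3)), in floating point."""
    return float(m ** 4) ** (i / m ** (eps / 3))


def grid_index(L, m, eps):
    """An index i with L_{m,i} <= L <= L_{m,i+1} (floating point, m >= 2)."""
    scale = m ** (eps / 3)
    return math.floor(scale * math.log(L) / math.log(m ** 4))
```

Since $\theta_m = (1 + 1/m)^{4}$, the exact rational value returned by `sandwich_parameters` makes the estimate $\theta_m = 1 + O(m^{-1})$ easy to inspect for small $m$.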
Now, we record that the upper and lower bounds above satisfy
\[
R_2\bigl(\alpha, L, N_m, \mathbf{1}_{\theta_m^{-1}[-1,1]}\bigr) \geqslant R_2\bigl(\alpha, L_{m,i}, N_m, \mathbf{1}_{\theta_m^{-1}[-1,1]}\bigr),
\]
\[
R_2\bigl(\alpha, L, N_{m+1}, \mathbf{1}_{\theta_m[-1,1]}\bigr) \leqslant R_2\bigl(\alpha, L_{m,i+1}, N_{m+1}, \mathbf{1}_{\theta_m[-1,1]}\bigr).
\]
Moreover, Lemma A.2 implies that there is a set $\Omega_{\varepsilon} \subset [0,1]$, with full measure, such that, if $\alpha \in \Omega_{\varepsilon}$ and $\varepsilon \in (0, \tfrac{1}{4}]$, then for all $m \geqslant 1$ and $i$ in the range $-m^{\varepsilon/3}/10 \leqslant i \leqslant m^{\varepsilon/3}$,
\[
R_2\bigl(\alpha, L_{m,i}, N_m, \mathbf{1}_{\theta_m[-1,1]}\bigr) = 2\theta_m L_{m,i}\bigl(1 + O_{\alpha,\varepsilon}(m^{-\frac{13}{15} + \frac{\varepsilon}{3}})\bigr).
\]
By using that $L_{m,i+1}/L_{m,i} = 1 + O_{\varepsilon}(N^{-\varepsilon/13})$ and also that $\theta_m = 1 + O(m^{-1})$, we infer that
\[
R_2(\alpha, L, N, \mathbf{1}_{[-1,1]}) \leqslant 2L\bigl(1 + O_{\alpha,\varepsilon}(N^{-\frac{1}{4}})\bigr)\bigl(1 + O_{\alpha,\varepsilon}(N^{-\frac{\varepsilon}{13}})\bigr)\bigl(1 + O_{\alpha,\varepsilon}(m^{-\frac{13}{15} + \frac{\varepsilon}{3}})\bigr) \leqslant 2L\bigl(1 + O_{\alpha,\varepsilon}(N^{-\frac{\varepsilon}{13}})\bigr).
\]
Similarly, we conclude that
\[
R_2(\alpha, L, N, \mathbf{1}_{[-1,1]}) \geqslant 2L\bigl(1 + O_{\alpha,\varepsilon}(N^{-\frac{\varepsilon}{13}})\bigr).
\]
Combining these two estimates shows (A.5), thus completing the proof of Theorem A.1. ∎

Appendix B  Discrepancy and $k$-point correlation functions at scale $N^{-\beta}$

The purpose of the present section is to record a few simple observations concerning the relationship between discrepancy and $k$-point correlation functions.

Definition B.1  Let $(x_n)_{n=1}^{\infty}$ be a sequence of points in $[0,1)$. Let $k \geqslant 2$ be a natural number, and let $g : \mathbb{R}^{k-1} \to [0,1]$ be a compactly supported function. Then, the $k$th correlation function $R_k((x_n)_{n=1}^{\infty}, L, N, g)$ is defined to be
\[
R_k((x_n)_{n=1}^{\infty}, L, N, g) := \frac{1}{N} \sum_{\substack{n_1, \ldots, n_k \leqslant N \\ \text{distinct}}} g\Bigl(\frac{N}{L} y_1, \frac{N}{L} y_2, \ldots, \frac{N}{L} y_{k-1}\Bigr),
\]
where $y_i = \{x_{n_i} - x_{n_{i+1}}\}_{\mathrm{sgn}}$, for $i = 1, \ldots, k-1$, with $\{\cdot\}_{\mathrm{sgn}} : \mathbb{R} \to (-1/2, 1/2]$ denoting the signed distance to the nearest integer.

The main point we are conveying here is that, as expected, the correlations are controlled on the scales in which the discrepancy allows us to count points asymptotically.
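Definition B.1 translates directly into a brute-force computation. The Python sketch below is an illustration under our own naming (both functions are hypothetical), evaluating $R_k$ for a finite list of points by summing $g$ over all ordered $k$-tuples of distinct indices. The cost grows like $N^{k}$, so it is meant only for sanity checks on very small inputs.

```python
from itertools import permutations


def signed_distance(t):
    """Signed distance from t to the nearest integer, with values in (-1/2, 1/2]."""
    r = t % 1.0  # Python's % returns a value in [0, 1) for floats here
    return r if r <= 0.5 else r - 1.0


def k_point_correlation(points, L, g, k):
    """R_k(points, L, N, g) as in Definition B.1, with N = len(points).

    g is a function of k - 1 real arguments; the sum runs over ordered
    k-tuples of distinct indices and is normalized by 1/N.
    """
    n = len(points)
    scale = n / L
    total = 0.0
    for idx in permutations(range(n), k):
        ys = [
            scale * signed_distance(points[idx[i]] - points[idx[i + 1]])
            for i in range(k - 1)
        ]
        total += g(*ys)
    return total / n
```

For instance, taking `g` to be the indicator of a box $[s_1, t_1] \times \cdots \times [s_{k-1}, t_{k-1}]$ gives the counting problem that appears in the proof of Lemma B.1.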
Lemma B.1  (a) Let $D_N$ denote the discrepancy of the sequence $(x_n)_{n=1}^{\infty}$ in $[0,1)$, and suppose
\[
\sup\{g > 0 : D_N \ll_g N^{-g} \text{ for all } N\} = \gamma > 0.
\]
Let $k \geqslant 2$ be a natural number, and let $Y$ be a uniformly distributed random variable modulo 1. Let $\varepsilon > 0$ be suitably small in terms of $\gamma$, and for all $N \in \mathbb{N}$ and $L \in \mathbb{R}$ satisfying $L \leqslant N$, let
\[
W((x_n)_{n=1}^{\infty}, L, N) := |\{n \leqslant N : x_n \in [Y, Y + L/N] \bmod 1\}|.
\]
Then, if $L$ is in the range $N^{1-\gamma+\varepsilon} < L \leqslant N$, we have
\[
\mathbb{E}\, W((x_n)_{n=1}^{\infty}, L, N)^{k} = L^{k}\bigl(1 + O_{\varepsilon}(k N^{-\varepsilon/2})\bigr). \tag{B.1}
\]
Furthermore, for all continuous functions $g : \mathbb{R}^{k-1} \to [0,1]$ and for all $L$ in the range $N^{1-\gamma+\varepsilon} < L < N^{1-\varepsilon}$,
\[
R_k((x_n)_{n=1}^{\infty}, L, N, g) = (1 + o_{g,\varepsilon,k}(1))\, L^{k-1} \int g(w)\, dw, \tag{B.2}
\]
as $N \to \infty$, where the error term is independent of the choice of parameters $L$.

(b) If $(a_n)_{n=1}^{\infty}$ is a strictly increasing sequence of positive integers, for almost every $\alpha \in [0,1]$, for all $N \in \mathbb{N}$ and for all $L \in \mathbb{R}$ in the range $N^{\frac{1}{2}+\varepsilon} < L \leqslant N$, the sequence $(\alpha a_n \bmod 1)_{n=1}^{\infty}$ satisfies estimates (B.1) and (B.2).

Remark: Part (b) of the lemma proves our earlier assertion (1.5).

Proof  Fix a small $\varepsilon > 0$ throughout this proof. For part (a), the proof of (B.1) is trivial. Indeed, by the discrepancy estimate, we have
\[
|\{n \leqslant N : x_n \in [Y, Y + L/N] \bmod 1\}| = L + O(N D_N) = L + O_{\varepsilon}(N^{1-\gamma+\varepsilon/2}) = L\bigl(1 + O_{\varepsilon}(N^{-\varepsilon/2})\bigr).
\]
Raising to the $k$th power and averaging over $Y$, we obtain (B.1).

To prove the correlation estimate (B.2), by approximating the function $g$ by step functions, we see that it is enough to prove it in the case when $g$ is the indicator function of a box $[s_1, t_1] \times \cdots \times [s_{k-1}, t_{k-1}]$. We may assume, without loss of generality, that $N$ is large enough, so that $(t_i - s_i)L/N \leqslant 1$ for all $i \leqslant k-1$. (This is why it is important for the correlation estimate to preclude the case $L = N$.)
Then, fixing $n_k$, we see that $R_k((x_n)_{n=1}^{\infty}, L, N, g)$ counts the number of $x_{n_{k-1}} \neq x_{n_k}$ such that
\[
x_{n_{k-1}} \in [s_{k-1}L/N + x_{n_k},\; t_{k-1}L/N + x_{n_k}] \bmod 1,
\]
times the number of $x_{n_{k-2}} \neq x_{n_{k-1}}, x_{n_k}$ such that
\[
x_{n_{k-2}} \in [s_{k-2}L/N + x_{n_{k-1}},\; t_{k-2}L/N + x_{n_{k-1}}] \bmod 1,
\]
and so forth. By the discrepancy estimate, the total number of choices is
\[
\bigl((t_{k-1} - s_{k-1})L + O_{\varepsilon}(L N^{-\varepsilon/2})\bigr) \times \bigl((t_{k-2} - s_{k-2})L + O_{\varepsilon}(L N^{-\varepsilon/2})\bigr) \times \cdots \times \bigl((t_1 - s_1)L + O_{\varepsilon}(L N^{-\varepsilon/2})\bigr).
\]
Summing over all $n_k$ and then normalizing by $1/N$, we have
\[
R_k((x_n)_{n=1}^{\infty}, L, N, g) = \Bigl(\prod_{i=1}^{k-1} (t_i - s_i)\Bigr) L^{k-1} \bigl(1 + O_{\varepsilon,g}(k N^{-\varepsilon/2})\bigr),
\]
as desired. This proves part (a) of the assertion. The remaining part follows by recalling that a classical (and far more general) result of Erdős and Koksma [9, Theorem 2] furnishes an upper bound on the discrepancy of $(\alpha a_n \bmod 1)_{n=1}^{\infty}$ of the quality $N^{-1/2+\varepsilon}$, for each fixed $\varepsilon > 0$ and for almost every $\alpha \in [0,1]$. ∎

Steinerberger [27] raised the question of whether “most sequences”⁶ have uniform pair correlations at some scale $0 < \beta < 1$. The above part (b) answers Steinerberger’s question in a strong sense. Furthermore, it seems worthwhile to record the following consequence. To state it, recall that for any irrational $\alpha \in [0,1]$, there is a unique sequence of positive integers $\alpha_n \geqslant 1$ such that
\[
\alpha = \lim_{n \to \infty} \cfrac{1}{\alpha_1 + \cfrac{1}{\alpha_2 + \cfrac{1}{\ddots + \cfrac{1}{\alpha_n}}}},
\]
where $\alpha_n$ is called the $n$th partial quotient of $\alpha$. Writing the fraction
\[
\cfrac{1}{\alpha_1 + \cfrac{1}{\alpha_2 + \cfrac{1}{\ddots + \cfrac{1}{\alpha_n}}}} =: \frac{p_n}{q_n}
\]
in lowest terms, we call $q_n$ a convergent denominator of $\alpha$.

⁶ The precise meaning of the word “most” was left open for interpretation by Steinerberger, and was already put in quotation marks in the original paper.

Corollary B.2  Let $\alpha_n$ be the $n$th partial quotient of the irrational number $\alpha \in [0,1]$.
Given $N \geqslant 1$, let $i(N)$ be such that the convergent denominator $q_{i(N)}$ of $\alpha$ satisfies $q_{i(N)} \leqslant N < q_{i(N)+1}$. If, for each $\varepsilon > 0$, we have
\[
A_N(\alpha) = \sum_{j \leqslant i(N)} \alpha_j \ll N^{\varepsilon},
\]
then the Kronecker sequence $(\alpha n)_{n=1}^{\infty}$ satisfies (B.2), for any $k \geqslant 2$ and for any scale $L$ such that $N^{\delta} < L < N$ for any fixed $\delta \in (0,1)$ (the $o(1)$ term in (B.2) then also depends on $\delta$).

Proof  As $D_N((\alpha n \bmod 1)_{n=1}^{\infty}) \ll_{\varepsilon} A_N(\alpha)$ (cf. [15, equation (3.18)]), Lemma B.1 completes the proof. ∎

It is well known (also with a higher-dimensional generalization due to Beck [3]) that, for almost every $\alpha \in [0,1]$, the discrepancy of the Kronecker sequence is $\ll (\log N)^{1+\varepsilon}$, for each $\varepsilon > 0$. Thus, the above corollary generalizes and sharpens a result of Weiß and Skill [28], stating that $(\phi n)_{n=1}^{\infty}$ has Poissonian pair correlations on each scale $\beta < 1$, where $\phi$ denotes the Golden ratio $\frac{\sqrt{5}+1}{2}$. Furthermore, the condition on $A_N$ is known to be true for algebraic $\alpha$, due to Roth’s famous approximation theorem. Finally, we remark that, if one so wished, one could readily replace the $N^{\varepsilon}$-terms in Corollary B.2 by appropriate powers of logarithms.

Appendix C  The Rudnick–Sarnak obstruction

In this final appendix, we detail an obstruction to studying the higher-order correlation function $R_k(\alpha, L, N, g)$. This obstruction is a generalization of a fundamental observation of Rudnick and Sarnak [23, Section 4], which those authors made in the context of constant $L$ and for the triple correlation function of $(\alpha n^{2} \bmod 1)_{n=1}^{\infty}$. We address the more general situation of sequences of the shape $(\alpha n^{d} \bmod 1)_{n=1}^{\infty}$, where $d \geqslant 2$ is a fixed integer. We are also interested in identifying the full range of $L$ in which the obstruction persists.
To this end, for a compactly supported smooth test function $g : \mathbb{R}^{k-1} \to \mathbb{R}$, we define the correlation function $R_k^{d}(\alpha, L, N, g)$ by
\[
R_k^{d}(\alpha, L, N, g) := \frac{1}{N} \sum_{\substack{1 \leqslant x_1, \ldots, x_k \leqslant N \\ \text{distinct}}} g\Bigl(\frac{N}{L}\{\alpha(x_1^{d} - x_2^{d})\}_{\mathrm{sgn}}, \ldots, \frac{N}{L}\{\alpha(x_{k-1}^{d} - x_k^{d})\}_{\mathrm{sgn}}\Bigr),
\]
where $\{\cdot\}_{\mathrm{sgn}} : \mathbb{R} \to (-1/2, 1/2]$ denotes the signed distance to the nearest integer. Rudnick and Sarnak’s approach to pair correlations, like in Appendix A, involves showing that
\[
\int_0^1 \Bigl(R_2^{d}(\alpha, L, N, g) - L\,\frac{N-1}{N}\,\widehat{g}(0)\Bigr)^{2} d\alpha = o_g(L^{2}),
\]
as $N \to \infty$. Therefore, in order to make a similar approach work for the higher $k$-point correlation functions, one would need
\[
\int_0^1 \Bigl(R_k^{d}(\alpha, L, N, g) - L^{k-1}\,\frac{(N)_k}{N^{k}}\,\widehat{g}(\mathbf{0})\Bigr)^{2} d\alpha = o_{d,k,g}(L^{2(k-1)}), \tag{C.1}
\]
as $N \to \infty$, where $\mathbf{0}$ is the zero vector in $\mathbb{R}^{k-1}$ and $(N)_k = N(N-1)\cdots(N-k+1)$ abbreviates the $k$th falling factorial. Our purpose here is to show that, for certain ranges of $d$, $k$, and $L$, equation (C.1) cannot hold.

Indeed, by applying the Poisson summation formula, one may expand $R_k^{d}(\alpha, L, N, g)$ into a Fourier series
\[
R_k^{d}(\alpha, L, N, g) = \sum_{\ell \in \mathbb{Z}} c_{k,\ell}^{d}(L, N, g)\, e(\ell\alpha),
\]
with certain Fourier coefficients $c_{k,\ell}^{d}(L, N, g)$. One may compute $c_{k,\ell}^{d}(L, N, g)$ explicitly. For a given vector $a \in \mathbb{Z}^{k-1}$, let $S_{a,k,\ell}^{d}(N)$ denote the set of integer vectors $x \in \mathbb{N}^{k}$, with distinct components $1 \leqslant x_i \leqslant N$, satisfying the Diophantine equation
\[
a_1(x_1^{d} - x_2^{d}) + \cdots + a_{k-1}(x_{k-1}^{d} - x_k^{d}) = \ell.
\]
One readily verifies that $c_{k,\ell}^{d}(L, N, g)$ is of the special form
\[
c_{k,\ell}^{d}(L, N, g) = \frac{L^{k-1}}{N^{k}} \sum_{a \in \mathbb{Z}^{k-1}} |S_{a,k,\ell}^{d}(N)|\; \widehat{g}\Bigl(\frac{L}{N} a\Bigr). \tag{C.2}
\]
From Parseval, we conclude that
\[
\int_0^1 \Bigl(R_k^{d}(\alpha, L, N, g) - L^{k-1}\,\frac{(N)_k}{N^{k}}\,\widehat{g}(\mathbf{0})\Bigr)^{2} d\alpha = \sum_{\ell \neq 0} |c_{k,\ell}^{d}(L, N, g)|^{2}.
\]
Now, let $\rho = N^{d+1+\varepsilon}/L$. We observe that if $|\ell| \geqslant \rho$ and $S_{a,k,\ell}^{d}(N) \neq \emptyset$, then $\|a\|_{\infty} \geqslant N^{1+\varepsilon}/(L(k-1))$.
Therefore, from the rapid decay of $\widehat{g}$ and the formula (C.2), we conclude that $|c_{k,\ell}^{d}(L, N, g)| = O_{\varepsilon,g,k,K}(N^{-K})$ for such $\ell$. Hence,
\[
\sum_{\ell \neq 0} |c_{k,\ell}^{d}(L, N, g)|^{2} = \sum_{0 < |\ell| \leqslant \rho} |c_{k,\ell}^{d}(L, N, g)|^{2} + O_{\varepsilon,g,k,K}(N^{-K}).
\]
To estimate the right-hand side, we note that, by the Cauchy–Schwarz inequality,
\[
\rho \sum_{0 < |\ell| \leqslant \rho} |c_{k,\ell}^{d}(L, N, g)|^{2} \geqslant \Bigl| \sum_{0 < |\ell| \leqslant \rho} c_{k,\ell}^{d}(L, N, g) \Bigr|^{2}.
\]
The sum inside the absolute value on the right-hand side equals, up to a term of size $O_{\varepsilon,g,k,K}(N^{-K})$, the quantity
\[
\sum_{\ell \in \mathbb{Z}} c_{k,\ell}^{d}(L, N, g)\, e(0) - L^{k-1}\,\frac{(N)_k}{N^{k}}\,\widehat{g}(\mathbf{0}),
\]
which is $R_k^{d}(0, L, N, g) + O_g(L^{k-1})$. Assuming that $g(\mathbf{0}) \neq 0$ and $L/N = o(1)$ as $N \to \infty$, this is equal to $N^{k-1}(1 + o(1))\, g(\mathbf{0})$, as $N \to \infty$. Now, by combining these considerations, we conclude that
\[
\int_0^1 \Bigl(R_k^{d}(\alpha, L, N, g) - L^{k-1}\,\frac{(N)_k}{N^{k}}\,\widehat{g}(\mathbf{0})\Bigr)^{2} d\alpha \gg_{\varepsilon,g,k} \frac{1}{\rho}\, N^{2(k-1)} |g(\mathbf{0})|^{2} = L N^{2k-d-3-\varepsilon} |g(\mathbf{0})|^{2}.
\]
If (C.1) is to hold for all functions $g$, for each $\varepsilon > 0$, we must have
\[
L \gg_{g,\varepsilon,k} N^{\frac{2k-d-3-\varepsilon}{2k-3}} = N^{1 - \frac{d+\varepsilon}{2k-3}}.
\]
In particular, for $k = 3$ and $d = 2$, the convergence (C.1) fails unless $L \gg N^{\frac{6-5-\varepsilon}{6-3}} = N^{\frac{1-\varepsilon}{3}}$. This justifies the statements we made in the introduction to the effect that the Rudnick–Sarnak obstruction for triple correlations extends throughout the range $L < N^{1/3}$. We also note, however, that as soon as $d > 2k - 3$, there is no such obstruction (for $L$ constant in terms of $N$).

The failure of (C.1) is an artefact of the integrand having a large spike when $\alpha \approx 0$ (and more generally, the integrand has large spikes when $\alpha$ is very well approximated by rationals with small denominators). One wonders whether $L^{2}$-convergence (C.1) can be recovered by restricting to the “minor arcs,” but this also appears to be a difficult problem.
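The exponent arithmetic behind the obstruction is easy to tabulate. The following Python sketch (a hypothetical helper of ours, with $\varepsilon$ sent to $0$) returns the limiting threshold exponent $1 - d/(2k-3)$ as an exact fraction, so that the argument above obstructs (C.1) for $L$ below $N$ raised to this exponent. A nonpositive value indicates that the argument yields no obstruction at constant $L$; this happens exactly when $d \geqslant 2k - 3$, with the boundary case $d = 2k - 3$ being an artefact of having dropped $\varepsilon$.

```python
from fractions import Fraction


def obstruction_exponent(k, d):
    """Limiting exponent 1 - d/(2k - 3) (the epsilon -> 0 form of the bound above).

    The lower bound on the variance rules out (C.1) when L is smaller than
    N raised to this exponent (up to the epsilon loss in the text).
    """
    if k < 2 or d < 2:
        raise ValueError("expected k >= 2 and d >= 2")
    return 1 - Fraction(d, 2 * k - 3)
```

For triple correlations of the dilated squares ($k = 3$, $d = 2$) this recovers the exponent $1/3$ discussed above, while for pair correlations ($k = 2$, $d = 2$) the exponent is negative.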
Acknowledgment  We would like to thank Zeev Rudnick and Christoph Aistleitner for helpful comments relating to previous versions of the manuscript, and Andrew Granville and Dimitris Koukoulopoulos for many interesting conversations. We would also like to thank several anonymous referees for their suggestions and corrections.

References

[1] C. Aistleitner and G. Larcher, Metric results on the discrepancy of sequences $(a_n \alpha)_{n \geq 1}$ modulo one for integer sequences $(a_n)_{n \geq 1}$ of polynomial growth. Mathematika 62(2016), no. 2, 478–491.
[2] C. Aistleitner, G. Larcher, and M. Lewko, Additive energy and the Hausdorff dimension of the exceptional set in metric pair correlation problems. Israel J. Math. 222(2017), no. 1, 463–485.
[3] J. Beck, Probabilistic Diophantine approximation, I. Kronecker sequences. Ann. of Math. (2) 140(1994), no. 2, 449–502.
[4] M. V. Berry and M. Tabor, Level clustering in the regular spectrum. Proc. Roy. Soc. Lond. A Math. Phy. Sci. 356(1977), no. 1686, 375–394.
[5] F. Boca and A. Zaharescu, Pair correlation of values of rational functions (mod p). Duke Math. J. 105(2000), no. 2, 267–307.
[6] K.-L. Chung, An estimate concerning the Kolmogoroff limit distribution. Trans. Amer. Math. Soc. 67(1949), no. 1, 36–50.
[7] D. El-Baz, J. Marklof, and I. Vinogradov, The two-point correlation function of the fractional parts of $\sqrt{n}$ is Poisson. Proc. Amer. Math. Soc. 143(2015), no. 7, 2815–2828.
[8] N. Elkies and C. McMullen, Gaps in $\sqrt{n}$ mod 1 and ergodic theory. Duke Math. J. 123(2004), no. 1, 95–139.
[9] P. Erdős and J. Koksma, On the uniform distribution modulo 1 of sequences (f(n, ϑ)). Nederl. Akad. Wetensch. Proc. 52(1949), 851–854.
[10] A. Eskin, G. Margulis, and S. Mozes, Quadratic forms of signature (2, 2) and eigenvalue spacings on rectangular 2-tori. Ann. of Math. (2) 161(2005), no. 2, 679–725.
[11] G. Harman, Metric number theory. London Mathematical Society Monographs New Series, 18, Clarendon Press, Oxford, UK, 1998.
[12] D. R. Heath-Brown, Pair correlation for fractional parts of $\alpha n^{2}$. Math. Proc. Cambridge Philos. Soc. 148(2010), no. 3, 385–407.
[13] H. Iwaniec and E. Kowalski, Analytic number theory. Vol. 53, American Mathematical Society, Providence, RI, 2004.
[14] A. Khintchine, Über einen Satz der Wahrscheinlichkeitsrechnung. Fund. Math. 6(1924), no. 1, 9–20.
[15] L. Kuipers and H. Niederreiter, Uniform distribution of sequences. Courier Corporation, North Chelmsford, MA, 2012.
[16] C. Lutsko, Long-range correlations of sequence modulo 1. Preprint, 2020. arXiv:2007.09292
[17] J. Marklof, The Berry–Tabor conjecture. In: C. Casacuberta, R. M. Miró-Roig, J. Verdera, and S. Xambó-Descamps (eds.), European Congress of Mathematics, Springer, New York, 2001, pp. 421–427.
[18] J. Marklof, Pair correlation densities of inhomogeneous quadratic forms. Ann. of Math. (2) 158(2003), no. 2, 419–471.
[19] H. Montgomery, Ten lectures on the interface between analytic number theory and harmonic analysis. CBMS Regional Conference Series in Mathematics, 84, American Mathematical Society, Providence, RI, 1994.
[20] Z. Rudnick, Quantum chaos? Notices Amer. Math. Soc. 55(2008), no. 1, 32–34.
[21] Z. Rudnick, A metric theory of minimal gaps. Mathematika 64(2018), no. 3, 628–636.
[22] Z. Rudnick and P. Kurlberg, The distribution of spacings between quadratic residues. Duke Math. J. 100(1999), 211–242.
[23] Z. Rudnick and P. Sarnak, The pair correlation function of fractional parts of polynomials. Comm. Math. Phys. 194(1998), no. 1, 61–70.
[24] Z. Rudnick, P. Sarnak, and A. Zaharescu, The distribution of spacings between the fractional parts of $n^{2}\alpha$. Invent. Math. 145(2001), no. 1, 37–57.
[25] Z. Rudnick and A. Zaharescu, The distribution of spacings between fractional parts of lacunary sequences. Forum Math. 14(2002), no. 5, 691–712.
[26] P. Sarnak, Values at integers of binary quadratic forms. In: Harmonic analysis and number theory (Montreal, PQ, 1996), CMS Conference Proceedings, 21, American Mathematical Society, Providence, RI, 1997, 181–203.
[27] S. Steinerberger, Poissonian pair correlation and discrepancy. Indag. Math. 29(2018), no. 5, 1167–1178.
[28] C. Weiß and T. Skill, Sequences with almost Poissonian pair correlations. Preprint, 2021. arXiv:1905.02760
[29] H. Weyl, Über die gleichverteilung von zahlen mod eins. Math. Ann. 77(1916), no. 3, 313–352.
[30] A. Zaharescu, Correlation of fractional parts of $n^{2}\alpha$. Forum Math. 15(2003), no. 1, 1–21.

University of Wisconsin-Madison, Madison, USA
e-mail: technau@wisc.edu

Trinity College, Cambridge, UK
e-mail: aw530@cam.ac.uk