CORNERS IN DENSE SUBSETS OF $\mathbb{P}^d$

ÁKOS MAGYAR AND TATCHAI TITICHETRAKUN

ABSTRACT. Let $\mathbb{P}^d$ be the $d$-fold direct product of the set of primes. We prove that if $A$ is a subset of $\mathbb{P}^d$ of positive relative upper density then $A$ contains infinitely many "corners", that is, sets of the form $\{x, x + te_1, \dots, x + te_d\}$ where $x \in \mathbb{Z}^d$, $t \in \mathbb{Z}$ and $\{e_1, \dots, e_d\}$ are the standard basis vectors of $\mathbb{Z}^d$. The main tools are the hypergraph removal lemma, the linear forms conditions of Green-Tao and the transference principles of Gowers and Reingold et al.

1. INTRODUCTION

A remarkable result in additive number theory due to Green and Tao [8] proves the existence of arbitrarily long arithmetic progressions in the primes. It roughly states that if $A$ is a subset of the primes of positive relative upper density then $A$ contains arbitrary constellations, that is, non-trivial affine copies of any finite set of integers. It may be viewed as a relative version of Szemerédi's theorem [20] on the existence of long arithmetic progressions in dense subsets of the integers. In higher dimensions, the multi-dimensional extension of Szemerédi's theorem, first proved by Furstenberg and Katznelson [4], states that if $A \subseteq \mathbb{Z}^d$ is of positive upper density then $A$ contains non-trivial affine copies of any finite set $F \subseteq \mathbb{Z}^d$. The proof in [4] uses ergodic methods; however, a more recent combinatorial approach was developed by Gowers [5] and also independently by Nagel, Rödl and Schacht [15].

It is natural to ask if both results have a common extension, that is, if the Furstenberg-Katznelson theorem can be extended to subsets of $\mathbb{P}^d$ of positive relative upper density, that is, when the base set of integers is replaced by that of the primes. In fact, this question was raised by Tao [22], where the existence of arbitrary constellations among the Gaussian primes was shown. A partial result, extending the original approach of [8], was obtained earlier by B.
Cook and the first author [3], where it was proved that relatively dense subsets of $\mathbb{P}^d$ contain an affine copy of any finite set $F \subseteq \mathbb{Z}^d$ which is in general position, meaning that each coordinate hyperplane contains at most one point of $F$. However, when the set $F$ is not in general position, it does not seem feasible to find a suitably pseudorandom measure supported essentially on the $d$-tuples of the primes, due to the self-correlations inherent in the direct product structure. For example, if we want to count corners $\{(a, b), (a + d, b), (a, b + d)\}$ in $A \subseteq \mathbb{P}^2$, then whenever $(a + d, b), (a, b + d) \in \mathbb{P}^2$ the remaining vertex $(a, b)$ must also be in $\mathbb{P}^2$. Thus the probability that all three vertices are in $\mathbb{P}^2$ (or in the direct product of the almost primes) is not $(\log N)^{-6}$ as one would expect, but roughly $(\log N)^{-4}$, preventing the use of any measure of the form $\nu \otimes \nu$.

In light of this our method is different, based on the hypergraph approach partly used already in [22], where one reduces the problem to that of proving a hypergraph removal lemma for weighted uniform hypergraphs. The natural approach is to use an appropriate form of the so-called transference principle [6], [16] to remove the weights and apply the removal lemmas for "un-weighted" hypergraphs obtained in [5], [15], [24]. This way our argument also covers the main result of [3] and in particular of [8].

Recently another proof of the (one-dimensional) Green-Tao theorem and of the main result of [22], based on a removal lemma for uniform hypergraphs, has been given in [1]. An interesting feature of the argument there is that it only uses (a weaker form of) the so-called linear forms condition of [8]. (The argument in [2] also uses only linear forms conditions.)

Recall that a set $A \subseteq \mathbb{P}^d$ has upper relative density $\alpha$ if
\[
\limsup_{N \to \infty} \frac{|A \cap \mathbb{P}_N^d|}{|\mathbb{P}_N^d|} = \alpha.
\]
Let us state our main result.

Theorem 1.1.
Let $A \subset (\mathbb{P}_N)^d$ have positive relative upper density $\alpha > 0$. Then $A$ contains at least
\[
C(\alpha)\,\frac{N^{d+1}}{(\log N)^{2d}}
\]
corners (affine copies) for some (computable) constant $C(\alpha)$.

As mentioned above, we will use the hypergraph approach, which has been used to establish the existence of corners (and then that of general constellations) in dense subsets of $\mathbb{Z}^d$ [5], [15], and which was first observed in the simplest case in [19], where the key tool is the triangle removal lemma of Ruzsa and Szemerédi [17].

Theorem 1.2 (Triangle Removal Lemma [17]). If a graph on $n$ vertices has at least $\delta n^2$ edge-disjoint triangles for some $0 < \delta < 1$, then it in fact contains at least $c(\delta) n^3$ triangles for some $c(\delta) > 0$.

In higher dimensions, there are hypergraph removal lemmas (e.g. [15], [5], [24]) which follow from a regularity lemma and a counting lemma. In our weighted setting, this method allows us to distribute the weights for primes (using the Green-Tao measure $\nu$ [8], see appendix) so that we can avoid dealing with higher moments of the Green-Tao measure $\nu$. We will define the notion of independent weight systems and prove some facts about them; the weight system related to corners is just a special case. The reason that we cannot handle more general constellations is that we do not quite have a suitable removal lemma (e.g. Theorem 5.1) for general weight systems on non-uniform hypergraphs. Indeed, for general constellations our approach leads to a weighted hypergraph with weights possibly attached to any lower dimensional hyperedge, making it difficult to apply the transference principle to remove the weights.

Proofs of the general multidimensional Szemerédi theorem in the primes are given in [2], [25], [14] using different methods. The proofs in [25] and [14] rely on "infinite linear forms conditions", which in turn rely on the Gowers inverse norm theorem [11]. In this paper, we exhibit a transference principle method to prove the special case of corners.
In particular we show how to prove the weighted removal lemma (Theorem 5.3) from the unweighted removal lemma (Theorem 5.1).

1.1. Notation. $[N] := \{1, 2, \dots, N\}$, $[M, N] := \{M, M + 1, \dots, N\}$, $\mathbb{P}_N := \mathbb{P} \cap [N]$. Write $x = (x_1, \dots, x_d)$, $y = (y_1, \dots, y_d)$, $\omega \in \{0, 1\}^d$, and let $P_\omega : \mathbb{Z}_N^{2d} \to \mathbb{Z}_N^d$ be the projection defined by
\[
P_\omega(x, y) = u = (u_1, \dots, u_d), \qquad u_j = \begin{cases} x_j & \text{if } \omega_j = 0, \\ y_j & \text{if } \omega_j = 1. \end{cases}
\]
For each $I \subseteq [d]$, $x_I = (x_i)_{i \in I}$. We denote the $j$th coordinate of $x_I$ by $(x_I)_j$. We may write $x$ for $x_{[d]}$ when we work in $\mathbb{Z}_N^d$. $\omega_I$ denotes an element of $\{0, 1\}^{|I|}$. Similarly we may write $\omega$ for $\omega_{[d]}$. We also define $P_{\omega_I}(x_I, y_I)$ in the same way. $\omega|_I$ is $\omega$ restricted to the index set $I$. For finite sets $X_j$, $j \in [d]$, and $I \subseteq [d]$, let $X_I := \prod_{j \in I} X_j$ and
\[
P_{\omega_I}(X_I, Y_I) = \prod_{i \in I} Z_i, \qquad Z_i = \begin{cases} X_i, & \omega_I(i) = 0, \\ Y_i, & \omega_I(i) = 1. \end{cases}
\]
If we want to fix some position, we can write for example $\omega_{(0,[2,d])}$ for an element of $\{0, 1\}^d$ whose first position is $0$. Also for each $\omega$, define $y_{1(\omega)} \in \mathbb{Z}^d$ by
\[
(y_{1(\omega)})_i = \begin{cases} 0 & \text{if } \omega_i = 0, \\ y_i & \text{if } \omega_i = 1, \end{cases} \qquad 1 \le i \le d,
\]
and
\[
(0(\omega))_i = \begin{cases} 1 & \text{if } \omega_i = 0, \\ 0 & \text{if } \omega_i = 1, \end{cases} \qquad 1 \le i \le d.
\]
$y_{0(\omega)}, 1(\omega) \in \mathbb{Z}^d$ are defined similarly. For any finite set $X$, any $f : X \to \mathbb{R}$, and any measure $\mu$ on $X$,
\[
\mathbb{E}_{x \in X} f(x) := \frac{1}{|X|} \sum_{x \in X} f(x), \qquad \int_X f \, d\mu := \frac{1}{|X|} \sum_{x \in X} f(x) \mu(x).
\]
Unless otherwise specified, the error term $o(1)$ means a quantity that goes to $0$ as $N, W \to \infty$.

2. WEIGHTED HYPERGRAPHS AND BOX NORMS

2.1. Hypergraph setting. First let us parameterize the affine copies of a corner as follows.

Definition 2.1. A non-degenerate corner is given by the following set of $d$-tuples of size $d + 1$ in $\mathbb{Z}^d$ (or $\mathbb{Z}_N^d$):
\[
\{(x_1, \dots, x_d), (x_1 + s, x_2, \dots, x_d), \dots, (x_1, \dots, x_{d-1}, x_d + s)\}, \qquad s \neq 0,
\]
or equivalently,
\[
\Big\{(x_1, \dots, x_d), \Big(z - \sum_{\substack{1 \le j \le d \\ j \neq 1}} x_j, x_2, \dots, x_d\Big), \Big(x_1, z - \sum_{\substack{1 \le j \le d \\ j \neq 2}} x_j, x_3, \dots, x_d\Big), \dots, \Big(x_1, \dots, x_{d-1}, z - \sum_{\substack{1 \le j \le d \\ j \neq d}} x_j\Big)\Big\}
\]
with $z \neq \sum_{1 \le i \le d} x_i$.

Now to a given set $A \subseteq \mathbb{Z}_N^d$, we assign a $(d + 1)$-partite hypergraph $G_A$ as follows: Let $X_1 = \dots$
$= X_{d+1} := \mathbb{Z}_N$ be the vertex sets, and for $j \in [1, d]$ let an element $a \in X_j$ represent the hyperplane $x_j = a$, and an element $a \in X_{d+1}$ represent the hyperplane $a = x_1 + \dots + x_d$. We join $d$ vertices from $d$ distinct vertex sets (which represent $d$ hyperplanes) by a hyperedge if these $d$ hyperplanes intersect in a point of $A$. Then a simplex in $G_A$ corresponds to a corner in $A$. Note that this includes trivial corners, which consist of a single point.

For each $I \subseteq [d + 1]$ let $E(I)$ denote the set of hyperedges whose vertices come exactly from the vertex sets $X_i$, $i \in I$. In order to count corners in $A$, we will place weights on some of these hyperedges that will represent the coordinates of the corner. To be more precise, we define the weights on $1$-edges:
\[
\nu_j(a) = \nu(a), \quad a \in X_j, \ j \le d, \qquad \nu_{d+1}(a) = 1, \quad a \in X_{d+1},
\]
and on $d$-hyperedges:
\[
\nu_I(a) = \nu\Big(a_{d+1} - \sum_{j \in I \setminus \{d+1\}} a_j\Big), \quad a \in E(I), \ |I| = d, \ d + 1 \in I, \qquad \nu_{[1,d]}(a) = 1, \quad a \in E([1, d]).
\]
We define measure spaces associated to our system of measures as follows. For $1 \le i \le d$, let $(X_i, d\mu_{X_i}) = (\mathbb{Z}_N, \nu)$ and let $\mu_{X_{d+1}}$ be the normalized counting measure on $X_{d+1} = \mathbb{Z}_N$. In particular the weights are $1$ or of the form $\nu_I(L_I(x_I))$ where all linear forms $\{L_I(x_I)\}$ are pairwise linearly independent. This is an example of what we call an independent weight system.

Definition 2.2 (Independent weight system). An independent weight system is a family of weights on the edges of a $(d + 1)$-partite hypergraph such that for any $I \subseteq [d + 1]$, $|I| \le d$, $\nu_I(x_I)$ is either $1$ or of the form $\prod_{j=1}^{K(I)} \nu(L_I^j(x_I))$, where all distinct linear forms $\{L_I^j\}_{I \subseteq [d+1],\, 1 \le j \le K(I)}$ are pairwise linearly independent; moreover, the form $L_I^j$ depends exactly on the variables $x_I = (x_j)_{j \in I}$.

In fact, for a weight system that arises from parametrizing affine copies of configurations in $\mathbb{Z}^d$, it is easy to see from the construction that for any $I \subseteq [d + 1]$, $|I| = d$, all distinct linear forms $\{L_J^k\}_{J \subseteq I,\, 1 \le k \le K(J)}$ are linearly independent; however, we do not need this fact in our paper.
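The correspondence between simplices of $G_A$ and corners in $A$ described in Section 2.1 can be checked by brute force in the smallest case $d = 2$, where $G_A$ is a tripartite graph and simplices are triangles. The following is only an illustrative sketch, not part of the paper's argument: the function names, the modulus `N = 7` and the test set `A` are assumptions made for this example, and all arithmetic is taken modulo `N`.

```python
from itertools import product

def corners_direct(A, N):
    """Count pairs (x, s), s in Z_N, with x, x + s*e1, x + s*e2 all in A.

    s = 0 is allowed, so trivial (single point) corners are counted too,
    as noted in the text.
    """
    count = 0
    for (x1, x2) in A:
        for s in range(N):
            if ((x1 + s) % N, x2) in A and (x1, (x2 + s) % N) in A:
                count += 1
    return count

def corners_via_hypergraph(A, N):
    """Count triangles (a1, a2, a3) in the tripartite graph G_A for d = 2.

    a1 in X1 is the line x1 = a1, a2 in X2 is the line x2 = a2,
    a3 in X3 is the line x1 + x2 = a3.
    """
    pairs = list(product(range(N), repeat=2))
    # lines x1 = a1 and x2 = a2 meet at (a1, a2)
    e12 = {(a1, a2) for a1, a2 in pairs if (a1, a2) in A}
    # lines x1 = a1 and x1 + x2 = a3 meet at (a1, a3 - a1)
    e13 = {(a1, a3) for a1, a3 in pairs if (a1, (a3 - a1) % N) in A}
    # lines x2 = a2 and x1 + x2 = a3 meet at (a3 - a2, a2)
    e23 = {(a2, a3) for a2, a3 in pairs if ((a3 - a2) % N, a2) in A}
    return sum(
        1
        for a1, a2, a3 in product(range(N), repeat=3)
        if (a1, a2) in e12 and (a1, a3) in e13 and (a2, a3) in e23
    )

# Hypothetical example data: a small subset of Z_7 x Z_7.
N = 7
A = {(0, 0), (1, 0), (0, 1), (3, 4), (5, 4), (3, 6), (2, 2)}
print(corners_direct(A, N), corners_via_hypergraph(A, N))
```

The two counts agree because a triangle $(a_1, a_2, a_3)$ determines the corner with $x = (a_1, a_2)$ and $s = a_3 - a_1 - a_2$, and conversely.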
We define a measure on $X_I$, $I \subseteq [d + 1]$, $|I| = d$, associated to an independent weight system by
\[
\int_{X_I} f \, d\mu_{X_I} := \mathbb{E}_{x_I} f \cdot \prod_{J \subseteq I,\, |J| < d} \nu_J(x_J)
\]
(that is, we put only weights of order $< d$ on the hypergraph), as well as on $X_{[d+1]}$ by
\[
\int_{X_{[d+1]}} f \, d\mu_{X_{[d+1]}} := \mathbb{E}_{x_{[d+1]}} f \cdot \prod_{I \subseteq [d+1],\, |I| < d} \nu_I(x_I),
\]
and the associated multi-linear form, which will be used to estimate the number of prime configurations, by
\[
\Lambda(f^I, |I| = d) := \int_{X_{[d+1]}} \prod_{|I| = d} f^I \, d\mu_{X_{[d+1]}}. \tag{2.1}
\]
Note that we can also parameterize any configuration of the form $\{x, x + tv_1, \dots, x + tv_d\}$ in $\mathbb{P}^d$ using an appropriate independent weight system.

Now for each $I = [d + 1] \setminus \{j\}$, $1 \le j \le d$, let
\[
f^I = 1_A\Big(x_1, \dots, x_{j-1}, x_{d+1} - \sum_{\substack{1 \le i \le d \\ i \neq j}} x_i, x_{j+1}, \dots, x_d\Big) \cdot \nu_I
\]
and for $I = [d]$ let $f^I = 1_A(x_1, \dots, x_d)$. Hence we attach the $1$-weights to the hypergraph and the $d$-weights to the functions. This is the way we distribute the weights in order to apply the transference principle. As the coordinates of a corner contained in $\mathbb{P}^d$ are given by $2d$ prime numbers, we have
\[
\Lambda = \mathbb{E}_{x_{[d+1]}} \prod_{|I| = d} f^I \prod_{i=1}^{d} \nu(x_i) = \frac{1}{N^{d+1}} \sum_{\substack{p_i \in A,\ 1 \le i \le 2d \\ (p_i)_{1 \le i \le 2d} \text{ constitutes a corner}}} \ \prod_{i=1}^{2d} \nu(p_i) \approx \frac{\log^{2d} N}{N^{d+1}} \, |\{\text{number of corners in } A\}|
\]
(ignoring the $W$-trick here and assuming that $\nu(N) \approx \log N$ for now). Indeed, if $\Lambda \ge C_1$ then
\[
|\{\text{number of corners in } A\}| \ge C_2 \frac{N^{d+1}}{\log^{2d} N}.
\]

2.2. Basic Properties of the Weighted Box Norm. In this section we describe the weighted version of Gowers's uniformity norms on a $(d + 1)$-partite hypergraph (called the box norm) and the so-called Gowers inner product associated to the hypergraph $G_A$ endowed with a weight system $\{\nu_I\}_{I \subseteq [d+1],\, |I| \le d}$.

Definition 2.3. For each $1 \le j \le d$, let $X_j, Y_j$ be finite sets (in this paper we will take $X_j = Y_j := \mathbb{Z}_N$) with a weight system $\nu$ on $X_{[d]} \times Y_{[d]}$.
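For $f : X_{[d]} \to \mathbb{R}$, define
\[
\|f\|_{\Box^d_\nu}^{2^d} := \int_{X_{[d]} \times Y_{[d]}} \prod_{\omega_{[d]}} f(P_{\omega_{[d]}}(x_{[d]}, y_{[d]})) \, d\mu_{X_{[d]} \times Y_{[d]}} := \mathbb{E}_{x_{[d]}} \mathbb{E}_{y_{[d]}} \prod_{\omega_{[d]}} f(P_{\omega_{[d]}}(x_{[d]}, y_{[d]})) \prod_{|I| < d} \prod_{\omega_I} \nu_I(P_{\omega_I}(x_I, y_I))
\]
and define the corresponding Gowers inner product of $2^d$ functions,
\[
\big\langle f_\omega,\ \omega \in \{0, 1\}^d \big\rangle_{\Box^d_\nu} := \int_{X_{[d]} \times Y_{[d]}} \prod_{\omega_{[d]}} f_{\omega_{[d]}}(P_{\omega_{[d]}}(x_{[d]}, y_{[d]})) \, d\mu_{X_{[d]} \times Y_{[d]}} := \mathbb{E}_{x_{[d]}} \mathbb{E}_{y_{[d]}} \prod_{\omega_{[d]}} f_{\omega_{[d]}}(P_{\omega_{[d]}}(x_{[d]}, y_{[d]})) \prod_{|I| < d} \prod_{\omega_I} \nu_I(P_{\omega_I}(x_I, y_I)).
\]
So $\langle f,\ \omega \in \{0, 1\}^d \rangle_{\Box^d_\nu} = \|f\|_{\Box^d_\nu}^{2^d}$.

Definition 2.4 (Dual Function). For $f, g : \mathbb{Z}_N^d \to \mathbb{R}$ define the weighted inner product
\[
\langle f, g \rangle_\nu := \int_{X_{[d]}} f \cdot g \, d\mu_{X_{[d]}} = \mathbb{E}_{x \in \mathbb{Z}_N^d} f(x) g(x) \prod_{|I| < d} \nu_I(x_I).
\]
Define the dual function of $f$ by
\[
\mathcal{D}f := \mathbb{E}_{y \in \mathbb{Z}_N^d} \prod_{\omega \neq 0} f(P_\omega(x, y)) \prod_{|I| < d} \prod_{\omega_I \neq 0} \nu_I(P_{\omega_I}(x_I, y_I)).
\]
So
\[
\|f\|_{\Box^d_\nu}^{2^d} = \mathbb{E}_{x \in \mathbb{Z}_N^d} f(x) \prod_{|I| < d} \nu_I(x_I) \ \mathbb{E}_{y \in \mathbb{Z}_N^d} \prod_{\omega \neq 0} f(P_\omega(x, y)) \prod_{|I| < d} \prod_{\omega_I \neq 0} \nu_I(P_{\omega_I}(x_I, y_I)) = \langle f, \mathcal{D}f \rangle_\nu.
\]
It may not be immediately clear from the definition that $\|\cdot\|_{\Box^d_\nu}$ is a norm, but this will follow from the following theorem, whose statement and proof strategy are similar to those of the analogous theorem for the unweighted Gowers inner product.

Theorem 2.1 (Gowers-Cauchy-Schwarz Inequality).
\[
\big| \langle f_\omega;\ \omega \in \{0, 1\}^d \rangle_{\Box^d_\nu} \big| \le \prod_{\omega_{[d]}} \|f_\omega\|_{\Box^d_\nu}.
\]

Proof. We will use the Cauchy-Schwarz inequality and the linear forms condition.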
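Write
\[
\begin{aligned}
\big\langle f_\omega;\ \omega \in \{0, 1\}^d \big\rangle_{\Box^d_\nu} = \mathbb{E}_{x_{[2,d]}, y_{[2,d]}}
&\Big[ \Big( \prod_{|I| < d,\, 1 \notin I} \prod_{\omega_I} \nu_I(P_{\omega_I}(x_I, y_I)) \Big)^{1/2} \mathbb{E}_{x_1} \nu(x_1) \prod_{\omega_{[2,d]}} f_{\omega_{(0,[2,d])}}(x_1, P_{\omega_{[2,d]}}(x_{[2,d]}, y_{[2,d]})) \prod_{|I| < d-1,\, 1 \notin I} \prod_{\omega_I} \nu_{\{1\} \cup I}(x_1, P_{\omega_I}(x_I, y_I)) \Big] \\
\times &\Big[ \Big( \prod_{|I| < d,\, 1 \notin I} \prod_{\omega_I} \nu_I(P_{\omega_I}(x_I, y_I)) \Big)^{1/2} \mathbb{E}_{y_1} \nu(y_1) \prod_{\omega_{[2,d]}} f_{\omega_{(1,[2,d])}}(y_1, P_{\omega_{[2,d]}}(x_{[2,d]}, y_{[2,d]})) \prod_{|I| < d-1,\, 1 \notin I} \prod_{\omega_I} \nu_{\{1\} \cup I}(y_1, P_{\omega_I}(x_I, y_I)) \Big].
\end{aligned}
\]
Applying the Cauchy-Schwarz inequality in the $x_{[2,d]}, y_{[2,d]}$ variables, one has
\[
\big| \langle f_\omega;\ \omega \in \{0, 1\}^d \rangle_{\Box^d_\nu} \big|^2 \le A \cdot B.
\]
Here,
\[
\begin{aligned}
A = \mathbb{E}_{x_{[2,d]}, y_{[2,d]}} &\prod_{|I| < d,\, 1 \notin I} \prod_{\omega_I} \nu_I(P_{\omega_I}(x_I, y_I)) \\
\times\ &\mathbb{E}_{x_1, y_1} \nu(x_1) \nu(y_1) \prod_{\omega_{[2,d]}} f_{\omega_{(0,[2,d])}}(x_1, P_{\omega_{[2,d]}}(x_{[2,d]}, y_{[2,d]})) \, f_{\omega_{(0,[2,d])}}(y_1, P_{\omega_{[2,d]}}(x_{[2,d]}, y_{[2,d]})) \\
\times\ &\prod_{|I| < d-1,\, 1 \notin I} \prod_{\omega_I} \nu_{\{1\} \cup I}(x_1, P_{\omega_I}(x_I, y_I)) \, \nu_{\{1\} \cup I}(y_1, P_{\omega_I}(x_I, y_I)) \\
= \big\langle f^{(0)}_\omega;\ &\omega \in \{0, 1\}^d \big\rangle_{\Box^d_\nu},
\end{aligned}
\]
where $f^{(0)}_{\tilde\omega} = f_{(0,\, \tilde\omega \cap [2,d])}$ for any $\tilde\omega_{[d]}$. And,
\[
\begin{aligned}
B = \mathbb{E}_{x_{[2,d]}, y_{[2,d]}} &\prod_{|I| < d,\, 1 \notin I} \prod_{\omega_I} \nu_I(P_{\omega_I}(x_I, y_I)) \\
\times\ &\mathbb{E}_{x_1, y_1} \nu(x_1) \nu(y_1) \prod_{\omega_{[2,d]}} f_{\omega_{(1,[2,d])}}(x_1, P_{\omega_{[2,d]}}(x_{[2,d]}, y_{[2,d]})) \, f_{\omega_{(1,[2,d])}}(y_1, P_{\omega_{[2,d]}}(x_{[2,d]}, y_{[2,d]})) \\
\times\ &\prod_{|I| < d-1,\, 1 \notin I} \prod_{\omega_I} \nu_{\{1\} \cup I}(x_1, P_{\omega_I}(x_I, y_I)) \, \nu_{\{1\} \cup I}(y_1, P_{\omega_I}(x_I, y_I)) \\
= \big\langle f^{(1)}_\omega;\ &\omega \in \{0, 1\}^d \big\rangle_{\Box^d_\nu},
\end{aligned}
\]
where $f^{(1)}_{\tilde\omega} = f_{(1,\, \tilde\omega \cap [2,d])}$ for any $\tilde\omega_{[1,d]}$. Then apply the Cauchy-Schwarz inequality in the $(x_{[3,d]}, y_{[3,d]})$ variables in the same way to end up with
\[
\big| \langle f_\omega;\ \omega \in \{0, 1\}^d \rangle_{\Box^d_\nu} \big|^4 \le \prod_{\tilde\omega_{[1,2]} \in \{0,1\}^{[1,2]}} \big\langle f^{\tilde\omega_{[1,2]}}_\omega;\ \omega \in \{0, 1\}^d \big\rangle_{\Box^d_\nu}.
\]
Iterating this, that is, applying the Cauchy-Schwarz inequality consecutively in the $(x_{[4,d]}, y_{[4,d]}), \dots, (x_{[d,d]}, y_{[d,d]})$ variables, we end up with
\[
\big| \langle f_\omega;\ \omega \in \{0, 1\}^d \rangle_{\Box^d_\nu} \big|^{2^d} \le \prod_{\omega_{[d]}} \langle f^\omega, \dots, f^\omega \rangle_{\Box^d_\nu} = \prod_{\omega_{[d]}} \|f_\omega\|_{\Box^d_\nu}^{2^d}, \qquad f^\omega = f_\omega.
\]

Corollary 2.2. $\|\cdot\|_{\Box^d_\nu}$ is a norm for $N$ sufficiently large.

Proof. First we show nonnegativity. By the linear forms condition, $\|1\|_{\Box^d_\nu} = 1 + o(1)$. Hence by the Gowers-Cauchy-Schwarz inequality, we have
\[
\|f\|_{\Box^d_\nu} \gtrsim \big| \langle f, 1, \dots, 1 \rangle_{\Box^d_\nu} \big| \ge 0
\]
for all sufficiently large $N$.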
Now, writing $h_0 = f$ and $h_1 = g$,
\[
\|f + g\|_{\Box^d_\nu}^{2^d} = \langle f + g, \dots, f + g \rangle_{\Box^d_\nu} = \sum_{\omega \in \{0,1\}^{2^d}} \langle h_{\omega_1}, \dots, h_{\omega_{2^d}} \rangle_{\Box^d_\nu} \le \sum_{\omega \in \{0,1\}^{2^d}} \|h_{\omega_1}\|_{\Box^d_\nu} \cdots \|h_{\omega_{2^d}}\|_{\Box^d_\nu} = \big( \|f\|_{\Box^d_\nu} + \|g\|_{\Box^d_\nu} \big)^{2^d}.
\]
Also it follows directly from the definition that $\|\lambda f\|_{\Box^d_\nu}^{2^d} = \lambda^{2^d} \|f\|_{\Box^d_\nu}^{2^d}$. Since the norms are nonnegative, we have $\|\lambda f\|_{\Box^d_\nu} = |\lambda| \, \|f\|_{\Box^d_\nu}$.

2.3. Weighted generalized von Neumann inequality. The generalized von Neumann inequality says that the average $\Lambda := \Lambda_{d+1,\nu}(f^I,\ I \subseteq [d + 1],\ |I| = d)$, see equation (2.1), is controlled by the weighted box norm. We show this inequality in the general setting of an independent weight system.

Theorem 2.3 (Weighted generalized von Neumann inequality). For each $I \subseteq [d + 1]$, $|I| = d$, let $f^I : X_I \to \mathbb{R}$. Let $\nu$ be an independent system of measures on $X_{[d+1]}$ that satisfies the linear forms conditions. Suppose the $f^I$ are dominated by $\nu$, i.e. $|f^I| \le \nu_I$. Write $f^{(i)} = f^{[d+1] \setminus \{i\}}$; then
\[
|\Lambda_{d+1,\nu}(f^{(1)}, \dots, f^{(d+1)})| \lesssim \min\{\|f^{(1)}\|_{\Box^d_\nu}, \dots, \|f^{(d+1)}\|_{\Box^d_\nu}\}.
\]

Proof. We will apply the Cauchy-Schwarz inequality and the linear forms condition. The idea is to consider one of the variables, say $x_j$, as a dummy variable and write $\Lambda := \mathbb{E}_{x_j}(\dots)\,\mathbb{E}_{x_{[d+1] \setminus \{j\}}}(\dots)$, then apply the Cauchy-Schwarz inequality to eliminate the lower complexity factors and use the linear forms condition to control the extra factor gained. We do this repeatedly, $d$ times. Each application of the Cauchy-Schwarz inequality doubles one variable, and after successive applications we obtain an expression in the form of a box norm.
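As a sanity check on the statement of Theorem 2.3, the inequality can be tested numerically in the simplest unweighted situation. The sketch below is an illustration under explicit assumptions rather than the paper's weighted setting: it takes $d = 2$, all weights $\nu \equiv 1$ (so no $o(1)$ losses occur and the inequality holds with constant $1$), and random functions bounded by $1$ on $\mathbb{Z}_N \times \mathbb{Z}_N$. The names `box_norm`, `corner_form` and `random_function` are hypothetical helpers introduced only for this example.

```python
import random
from itertools import product

def box_norm(f, N):
    """Unweighted box norm of f on Z_N x Z_N (d = 2, nu = 1).

    ||f||^4 = E_{x1,y1,x2,y2} f(x1,x2) f(y1,x2) f(x1,y2) f(y1,y2).
    """
    total = 0.0
    for x1, y1, x2, y2 in product(range(N), repeat=4):
        total += f[x1][x2] * f[y1][x2] * f[x1][y2] * f[y1][y2]
    # The average is nonnegative in exact arithmetic; clamp rounding noise.
    return max(total / N**4, 0.0) ** 0.25

def corner_form(f1, f2, f3, N):
    """Lambda(f1, f2, f3) = E_{x1,x2,x3} f1(x2,x3) f2(x1,x3) f3(x1,x2).

    Here f_i omits the variable x_i, matching f^(i) = f^([d+1] \\ {i}).
    """
    total = 0.0
    for x1, x2, x3 in product(range(N), repeat=3):
        total += f1[x2][x3] * f2[x1][x3] * f3[x1][x2]
    return total / N**3

def random_function(N, rng):
    """A random function Z_N x Z_N -> [-1, 1], stored as a nested list."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(N)]

rng = random.Random(0)
N = 5
f1, f2, f3 = (random_function(N, rng) for _ in range(3))
lhs = abs(corner_form(f1, f2, f3, N))
rhs = min(box_norm(f, N) for f in (f1, f2, f3))
print(lhs <= rhs + 1e-12)
```

In this unweighted case the bound follows from two applications of the Cauchy-Schwarz inequality, which is the same mechanism the weighted proof uses, with the linear forms condition replacing the trivial bound on the extra factors.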
First apply Cauchy-Schwartz’s inequality in xd+1 variable to eliminate f (d+1) d Y Y Y (d+1) (i) |Λ| ≤ Ex[d] f (x[d] ) νI (xI )Exd+1 f (x[d+1]\{i} ) νI (xI ) i=1 |I|<d,d+1∈I / 1/2 νI (xI ) ν[d] (x[d] ) Y ≤ Ex[d] ν[d] (x[d] ) |I|<d,d+1∈I / |I|<d,d+1∈I Y |I|<d,d+1∈I / 1/2 d Y νI (xI ) × Exd+1 f (i) i=1 Y |I|<d,d+1∈I νI (xI ) Now by the linear forms condition on the face {X[d+1]\{d+1} }(as the linear forms defining an independent weight system are pairwise linearly independent), we have Y Ex[d] ν[d] (x[d] ) νI (xI ) = 1 + o(1), |I|<d,d+1∈I / hence Y |Λ|2 . Ex[d] ν[d] (x[d] ) νI (xI ) |I|<d,d+1∈I / × Exd+1 ,yd+1 d Y Y f (i) (x[d]\{i} , Pωd+1 (xd+1 , yd+1 )) i=1 ωd+1 ∈{0,1} Y Y νI (xI\{d+1} , Pωd+1 (xd+1 , yd+1 )) |I|<d ωd+1 ∈{0,1} d+1∈I Next we want to eliminate f (d) (x[d+1]\{d} ) ≤ ν[d+1]\{d} (x[d+1]\{d} ). Seperating the average in xd variable, write Y Y Y |Λ|2 . Ex[d+1]\{d} ,yd+1 ν[d+1]\{d} (x[d−1] , Pωd+1 (xd+1 , yd+1 )) νI (PωI∩{d+1} (xI , yI )) |I|<d,d∈I / ω{d+1}∩I ωd+1 ∈{0,1} × Y νI (xI )Exd d−1 Y f (i) (x[d]\{i} , Pωd+1 (xd+1 , yd+1 )) Y νI (PωI∩{d+1} (xI , yI )) · ν[d] (x[d] ) |I|<d,d∈I ωI∩{d+1} i=1 |I|<d d,d+1∈I / Y Again, by the linear forms condition on the face {Pωd+1 (X[d+1]\{d} , Y[d+1]\{d} )}ωd+1 ∈{0,1} , we have Y Y Y Y Ex[d+1]\{d} ,yd+1 ν[d+1]\{d} (x[d−1] , Pωd+1 (xd+1 , yd+1 )) νI (PωI∩{d+1} (xI , yI )) νI (xI ) |I|<d,d∈I / ω{d+1}∩I ωd+1 ∈{0,1} is 1 + o(1) = O(1) and hence |Λ|4 .Ex[d−2] ,xd ,yd ,xd+1 ,yd+1 Y ω[d,d+1] ν[d+1]\{d−1} (Pω[d,d+1] (x[d+1]\{d−1} , y[d+1]\{d−1} )) |I|<d d,d+1∈I / 8 ÁKOS MAGYAR AND TATCHAI TITICHETRAKUN Y × Y νI (Pω[d,d+1]∩I (xI , yI )) |I|≤d,d−1∈I / ω[d,d+1]∩I × Exd−1 d−2 Y Y Y ν[d] (Pωd (x[d] , y[d] )) νI (PωI∩[d,d+1] (xI , yI )) |I|≤d ω[d,d+1]∩I d−1∈I i=1 ω[d,d+1] × Y Y f (i) (Pω[d,d+1] (xI , yI )) Y ν[d+1]\{d} (Pωd+1 (x[d+1]\{d} , y[d+1]\{d} )) ωd+1 ωd Continue using Cauchy-Schwartz inequality in xd−1 , ..., x2 in a similar fashion , using that Y Y Y Ex[d+1]\{r} ,y[d+1]\{r} ν[d+1]\{r} (Pω[r+1,d+1] (x[d+1]\{r} , y[d+1]\{r} 
)) νI (Pω[r+1,d+1]∩I (xI , yI )) ω[r+1,d+1] |I|≤d ω[r+1,d+1]∩I r∈I / is 1+o(1) = O(1) by linear forms are on all faces {Pω[r+1,d+1] (X[d+1]\{r} , Y[d+1]\{r} )}ω[r+1,d+1] ∈{0,1}[r+1,d+1] . Eventually, we obtain Y Y Y d |Λ|2 . Ex[2,d+1] ,y[2,d+1] f (1) (Pω[2,d+1] (x[2,d+1] , y[2,d+1] )) νI (Pω[2,d+1]∩I (xI , yI )) ω[2,d+1] |I|<d,1∈I / ω[2,d+1]∩I × W (Pω[2,d+1] (x[2,d+1] , y[2,d+1] )) (2.2) where Y W := W (Pω[2,d+1] (x[2,d+1] , y[2,d+1] ); ω ∈ {0, 1}[2,d+1] ) := Ex1 Y νI (x1 , Pω[2,d+1]∩I (xI\{1} , yI\{1} )) |I|<d ω[2,d+1]∩I × d+1 Y Y ν[d+1]\{k} (Pω[2,d+1] (x[d+1]\{k} , y[d+1]\{k} )) k=2 ω[2,d+1]\{k} 2d Write the RHS of (2.2) = f (1) d + E ν where Y |E| ≤ Ex[2,d+1] ,y[2,d+1] f (1) (Pω[2,d+1] (x[2,d+1] , y[2,d+1] )) ω[2,d+1] Y ω[2,d+1] × Ex[2,d+1] ,y[2,d+1] νI (PωI (xI , yI )) × |W − 1| |I|<d,1∈I / ωI We wish to show that E = o(1). Taking square, Y |E|2 ≤ Ex[2,d+1] ,y[2,d+1] ν[2,d+1] (Pω[2,d+1] (x[2,d+1] , y[2,d+1] )) Y Y Y Y νI (PωI (xI , yI )) |I|<d,1∈I / ωI ν[2,d+1] (Pω[2,d+1] (x[2,d+1] , y[2,d+1] )) ω[2,d+1] Y Y νI (PωI (xI , yI ))|W − 1|2 |I|<d,1∈I / ωI The term on the first line is 1+o(1) by linear form condition on all the faces {Pω[2,d+1] (X[2,d+1] , Y[2,d+1] )}ω . [2,d+1]∈{0,1}[2,d+1] So we just need to show Y Y Y Ex[2,d+1] ,y[2,d+1] ν[2,d+1] (Pω[2,d+1] (x[2,d+1] , y[2,d+1] )) νI (PωI (xI , yI ))W = 1 + o(1) ω[2,d+1] |I|<d,1∈I / ωI (2.3) Ex[2,d+1] ,y[2,d+1] Y ω[2,d+1] ν[2,d+1] (Pω[2,d+1] (x[2,d+1] , y[2,d+1] )) Y Y νI (PωI (xI , yI ))W 2 = 1 + o(1) |I|<d,1∈I / ωI (2.4) (2.3) follows from linear form conditions on the faces X[d+1] × Y[2,d+1] . (2.4) follows from linear form conditions on the faces X[d+1] × Y[d+1] and we are done. CORNERS IN DENSE SUBSETS OF Pd 9 3. T HE DUAL FUNCTION ESTIMATE . Functions with bounded dual norm are something that can be described as obstruction to Gowers uniformity. Gowers [5] demonstrates how to derive transference principle using the norm whose its dual is an algebra norm. 
The dual of the Gowers uniformity norm is not an algebra norm, but by restricting to functions dominated by a pseudorandom measure, its dual satisfies some nice algebraic properties that will be useful for deriving the transference principle. One important property is the dual function condition stated in Theorem 3.1 below. In this section we prove that the dual norm of a finite product (say, of $K$ terms) of dual functions $\mathcal{D}F$ with $F \le \nu$ is bounded. Here we put a limit $K(\alpha)$ on the number of terms and obtain the uniform bound $O(1)$ (via the linear forms conditions), which is sufficient for our applications of the hypergraph removal lemma. In [8] there is no restriction on the size of $K$, but the bound is then not uniform: it is of the form $O_K(1)$ (they apply the correlation condition in order to avoid an infinite linear forms condition).

Theorem 3.1. Consider any independent measure system and any fixed $J \subseteq [d + 1]$, $|J| = d$. Let $F_1, \dots, F_K : X_J \to \mathbb{R}$ be given functions with $F_j(x_J) \le \nu_J(x_J)$. Then for each $1 \le K \le K(\alpha)$ we have that
\[
\Big\| \prod_{j=1}^{K} \mathcal{D}F_j \Big\|^*_{\Box^d_\nu} = O(1).
\]

Proof. We will denote by $I$ the subsets of a fixed set $J \subseteq [d + 1]$, $|J| = d$. First, write
\[
\mathcal{D}F_j(x) = \mathbb{E}_{y^j \in \mathbb{Z}_N^d} \prod_{\omega \neq 0} F_j(P_\omega(x, y^j)) \prod_{|I| < d} \prod_{\omega_I \neq 0} \nu_I(P_{\omega_I}(x_I, y_I^j)).
\]
Now assume $\|f\|_{\Box^d_\nu} \le 1$; then
\[
\begin{aligned}
\Big\langle f, \prod_{j=1}^{K} \mathcal{D}F_j \Big\rangle_\nu &= \mathbb{E}_{x \in \mathbb{Z}_N^d} f(x) \prod_{j=1}^{K} \mathcal{D}F_j(x) \prod_{|I| < d} \nu_I(x_I) \\
&= \mathbb{E}_{x \in \mathbb{Z}_N^d} f(x) \, \mathbb{E}_{y^1, \dots, y^K \in \mathbb{Z}_N^d} \prod_{j=1}^{K} \Big[ \prod_{\omega \neq 0} F_j(P_\omega(x, y^j)) \prod_{|I| < d} \prod_{\omega_I \neq 0} \nu_I(P_{\omega_I}(x_I, y_I^j)) \Big] \prod_{|I| < d} \nu_I(x_I).
\end{aligned}
\]
We will compare this to the box norm in order to exploit the fact that $\|f\|_{\Box^d_\nu} \le 1$. To compare this to the Gowers inner product, let us introduce the following change of variables: for a fixed $y \in \mathbb{Z}_N^d$, write $y^j \mapsto y^j + y$ for $1 \le j \le K$; then our expression takes the form
\[
\Big\langle f, \prod_{j=1}^{K} \mathcal{D}F_j \Big\rangle_\nu = \mathbb{E}_{y^1, \dots, y^K} \mathbb{E}_x f(x) \prod_{j=1}^{K} \Big[ \prod_{\omega \neq 0} F_j(P_\omega(x, y + y^j)) \prod_{|I| < d} \prod_{\omega_I \neq 0} \nu_I(P_{\omega_I}(x_I, y_I^j + y_I)) \Big] \prod_{|I| < d} \nu_I(x_I).
\]
The shift $y^j \mapsto y^j + y$ is a bijection of the group $\mathbb{Z}_N^d$, so we may also average over $y$.
This is equal to the average K Y Y Y Y j j Ey1 ,...,yK ∈Zd Ex,y∈Zd f (x) Fj (Pω (x, y + y )) νI (PωI (xI , yI + yI )) νI (xI ) N N j=1 |I|<d ωI 6=0 ω6=0 For ω ∈ {0, 1}d , Y = (y 1 , . . . , y k ) ∈ (ZdN )k . We will define functions Gω,Y (x) : ZdN → R such that f, K Y j=1 DFj ν D E = Ey1 ,..,yK Gω,Y ; ω ∈ {0, 1}d dν 10 ÁKOS MAGYAR AND TATCHAI TITICHETRAKUN To do this, let G0 (x) := f (x) and for each ω̃ 6= 0, Y , define Y 1 K Y Y 2d−|I| − 1 j j Gω̃,Y (x) := Fj (x + y1(ω̃) ) νI ((x + y1(ω̃) ) I ) νI (xI ) 2d−|I| × j=1 |I|<d |I|<d Hence for ω̃ 6= 0 Y K Y Y − 1 1 j 2d−|I| j νI (Pω̃ (x, y)I ) 2d−|I| νI ((Pω̃ ((x, y+y ) I × Gω̃,Y (Pω̃ (x, y)) = Fj (Pω̃ (x, y+y )) j=1 |I|<d |I|<d Remark 3.1. For each I ⊆ [d] and fixed ωI , the number of ω[d] such that ω[d] |I = ωI is 2d−|I| and Pω (x, y)|I = PωI (xI , yI ) ⇐⇒ ω|I = ωI Hence D E Gω,Y ; ω ∈ {0, 1}d dν = Ex,y∈Zd N × Y = Ex,y∈Zd N Y Gω,Y (Pω (x, y)) × ω Y Y νI (PωI (xI , yI )) |I|<d ωI K YY Y 1 j j d−|I| 2 Fj (Pω (x, y + y ))( νI (Pω ((x, y) + y1(ω) ) I ) ω j=1 |I|<d νI (Pω (xI , yI )|I ) 1 − d−|I| 2 × Y Y νI (PωI (xI , yI )) |I|<d ωI |I|<d = Ex,y∈Zd N Y Y Y K Y K Y j j Fj (Pω (x, y + y )) × νI (PωI (xI , yI + yI )) νI (xI ) f (x) j=1 ω6=0 j=1 ωI 6=0 |I|<d Hence we have hf, K Y E D DFj iν = Ey1 ,..,yK Gω ; ω ∈ {0, 1}d j=1 dν Then by Gowers-Cauchy-Schwartz’s and arithmetic-geometric mean inequality, we have K Y X Y d hf, DFj iν ≤ kf kdν kGω,Y kd . 1 + kGω,Y k2d ν j=1 ν ω[d] 6=0 ω6=0 Hence to prove the dual function estimate, it is enough to show that d Ey1 ,...,yK kGω̃,Y k2d = O(1) ν For any fixed ω̃ 6= 0. 
Now
\[
\begin{aligned}
\mathbb{E}_{y^1, \dots, y^K} \|G_{\tilde\omega, Y}\|_{\Box^d_\nu}^{2^d} &= \mathbb{E}_{y^1, \dots, y^K} \mathbb{E}_{x, y} \prod_{\omega} G_{\tilde\omega, Y}(P_\omega(x, y)) \prod_{|I| < d} \prod_{\omega_I} \nu_I(P_{\omega_I}(x_I, y_I)) \\
&\le \mathbb{E}_{y^1, \dots, y^K} \mathbb{E}_{x, y} \prod_{\omega} \prod_{j=1}^{K} \nu_{[d]}\big(P_\omega(x, y) + y^j_{1(\tilde\omega)}\big) \prod_{|I| < d} \nu_I\big((P_\omega(x, y) + y^j_{1(\tilde\omega)})_I\big)^{\frac{1}{2^{d-|I|}}} \\
&\qquad \times \prod_{|I| < d} \nu_I\big(P_\omega(x, y)_I\big)^{-\frac{1}{2^{d-|I|}}} \times \prod_{|I| < d} \prod_{\omega_I} \nu_I(P_{\omega_I}(x_I, y_I)) \\
&= \mathbb{E}_{y^1, \dots, y^K} \mathbb{E}_{x, y} \prod_{j=1}^{K} \prod_{\omega} \nu_{[d]}\big(P_\omega(x, y) + y^j_{1(\tilde\omega)}\big) \prod_{|I| < d} \prod_{\omega_I} \nu_I\big(P_{\omega_I}(x_I, y_I) + (y^j_{1(\tilde\omega)})_I\big)
\end{aligned}
\]
by Remark 3.1 above. As the linear forms appearing in the above expression are pairwise linearly independent, this is $O(1)$ by the linear forms condition, as required.

4. TRANSFERENCE PRINCIPLE

In this section, we slightly modify the transference principle of [6] (see Theorem 4.5 below), which will allow us to deduce results for functions dominated by a pseudorandom measure from the corresponding results on bounded functions. The main difference between this paper and [8] is that here the dual function may not be bounded. We will work on the set on which our functions have bounded dual, and treat the contributions of the remaining set as error terms. We will need an explicit description, using the correlation condition (see appendix), of the set $\Omega(T)$ on which the dual function is bounded by $T$. In general, $T$ will depend on $\epsilon$ and $T \to \infty$ as $\epsilon \to 0$, but when we apply the removal lemma we will choose $\epsilon$ to be some small number depending on $\alpha$; hence if $\alpha$ is fixed then we can regard $\epsilon$ as a fixed small constant and $T$ as a fixed large constant.

We will work with functions $f : X_I \to \mathbb{R}$ dominated by $\nu_I$. Without loss of generality, $I = [d]$. Let $\langle \cdot \rangle$ be any inner product on $\mathcal{F} := \{f : X_{[d]} \to \mathbb{R}\}$ written as $\langle f, g \rangle = \int f \cdot g \, d\mu$ for some measure $\mu$ on $X_{[d]}$.

4.1. Dual Boundedness on $X_I$. One property of the dual functions that is used in [8] is their boundedness. However, in the weighted setting this is generally not true. To get around this, we will work on sets on which the dual functions are bounded and treat the contributions of the remaining parts as error terms.
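Consider any independent weight system. Let $I \subseteq [d + 1]$, $|I| = d$, $f : X_I \to \mathbb{R}$, $|f| \le \nu_I$ (without loss of generality $I = [d]$). Recall
\[
\mathcal{D}f = \mathbb{E}_y \prod_{\omega \neq 0} f(P_\omega(x, y)) \prod_{|I| < d} \prod_{\omega_I \neq 0} \nu_I(P_{\omega_I}(x_I, y_I)).
\]
Write $h_{\omega_I} = L_I(x)|_{0(\omega_I)}$; hence, using the correlation condition (see appendix), we have
\[
|\mathcal{D}f| \le \prod_{\emptyset \neq J \subseteq [d]} \ \sum_{(\omega_{I_1}, \omega_{I_2}) \in T_J} \tau\big( W \cdot (a_{\omega_{I_1}} h_{\omega_{I_1}} - a_{\omega_{I_2}} h_{\omega_{I_2}}) + (a_{\omega_{I_1}} - a_{\omega_{I_2}}) b \big) \tag{4.1}
\]
where for each $J \subsetneq [d]$, $J \neq \emptyset$,
\[
T_J := \big\{ \{\omega_{I_1}, \omega_{I_2}\} : \omega_{I_1}, \omega_{I_2} \neq 0,\ \omega_{I_1} \neq \omega_{I_2},\ 1(\omega_{I_1}) = 1(\omega_{I_2}) = J,\ \exists c \in \mathbb{Q},\ L_{I_1}(y_{1(\omega_{I_1})}) = c L_{I_2}(y_{1(\omega_{I_2})}) \big\}
\]
and where the $a_{\omega_{I_j}} \in \mathbb{Q}$ are some constants (in our case of corners, they will be integers). Define
\[
\Omega_J(T) = \Big\{ x_{[d]} : \sum_{\{\omega_{I_1}, \omega_{I_2}\} \in T_J} \tau\big( W \cdot (a_{\omega_{I_1}} h_{\omega_{I_1}} - a_{\omega_{I_2}} h_{\omega_{I_2}}) + (a_{\omega_{I_1}} - a_{\omega_{I_2}}) b \big) \le T^{1/2^d} \Big\} \tag{4.2}
\]
\[
\Omega(T) = \bigcap_{J \subsetneq [d]} \Omega_J(T).
\]
So $\mathcal{D}f$ is bounded by $T$ on $\Omega(T)$ for any fixed $T > 1$.

4.2. Transference principle.

Definition 4.1. For each $T > 1$ we have the set $\Omega(T)$, and we define the following sets:
\[
\mathcal{F} := \{f : X_{[d]} \to \mathbb{R}\}, \qquad \mathcal{F}_T := \{f \in \mathcal{F} : \mathrm{supp}(f) \subseteq \Omega(T)\}, \qquad S_T := \{f \in \mathcal{F}_T : |f| \le \nu_{[d]}(x_{[d]}) + 2\}.
\]
We define the following (basic anti-correlation) norm on $\mathcal{F}_T$:
\[
\|f\|_{BAC} := \max_{g \in S_T} |\langle f, \mathcal{D}g \rangle|. \tag{4.3}
\]
We have the following basic properties of this norm.

Proposition 1.
(1) $g \in \mathcal{F}_T \Rightarrow \mathcal{D}g \in \mathcal{F}_T$.
(2) $\|\cdot\|_{BAC}$ is a norm on $\mathcal{F}_T$ and can be extended to a seminorm on $\mathcal{F}$. Furthermore, we have $\|f\|_{BAC} = \|f \cdot 1_{\Omega(T)}\|_{BAC}$ for $f \in \mathcal{F}$.
(3) $\mathrm{Span}\{\mathcal{D}g : g \in S_T\} = \mathcal{F}_T$.
(4) $\|f\|^*_{BAC} = \inf\big\{ \sum_{i=1}^{k} |\lambda_i| : f = \sum_{i=1}^{k} \lambda_i \mathcal{D}g_i,\ g_i \in S_T \big\}$ for $f \in \mathcal{F}_T$.

Remark 4.1. If $f \notin \mathcal{F}_T$ then $\mathrm{supp}(f) \not\subseteq \Omega(T)$, so $f$ is not of the form $\sum_{i=1}^{k} \lambda_i \mathcal{D}g_i$ with $g_i \in \mathcal{F}_T$, as the right-hand side vanishes outside $\Omega(T)$.

Proof. (1) Suppose $(\tilde{x}_1, \dots, \tilde{x}_d) \in \Omega(T)^C$; then there is a $J \subsetneq [d]$ such that $K_J(\tilde{x}_{[d] \setminus J}) > T$, where $K_J$ is the function in the definition of $\Omega_J(T)$. Let $g \in \mathcal{F}_T$; then $g(\tilde{x}_{[d] \setminus J}, x_J) = 0$ for all $x_J \in X_J$. So $\mathcal{D}g(\tilde{x}_{[d] \setminus J}, x_J) = g(\tilde{x}_{[d] \setminus J}, x_J) E(x) = 0$ for some function $E$, so $\mathcal{D}g \in \mathcal{F}_T$.

(2) It follows directly from the definition that $\|f + g\|_{BAC} \le \|f\|_{BAC} + \|g\|_{BAC}$ and $\|\lambda f\|_{BAC} = |\lambda| \, \|f\|_{BAC}$ for any $\lambda \in \mathbb{R}$.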
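Now suppose $f \in \mathcal{F}_T$ is not identically zero; we need to show that $\|f\|_{BAC} \neq 0$. Since $X_{[d]}$ is a finite set, we have that $\|f\|_\infty = \max_x |f(x)| < \infty$. Let $g = \gamma f$ where $\gamma > 0$ is a constant such that $\|g\|_\infty < 2$; then $g \in S_T$ and
\[
\langle f, \mathcal{D}g \rangle = \langle f, \mathcal{D}(\gamma f) \rangle = \gamma^{2^d - 1} \langle f, \mathcal{D}f \rangle > 0,
\]
so $\|f\|_{BAC} > 0$. Now since $\mathrm{supp}(\mathcal{D}g) \subseteq \Omega(T)$, we have for any $f \in \mathcal{F}$
\[
\|f\|_{BAC} = \sup_{g \in S_T} |\langle f, \mathcal{D}g \rangle| = \sup_{g \in S_T} |\langle f \cdot 1_{\Omega(T)}, \mathcal{D}g \rangle| = \|f \cdot 1_{\Omega(T)}\|_{BAC}.
\]

(3) If $\mathrm{span}\{\mathcal{D}g : g \in S_T\} \neq \mathcal{F}_T$, then there is an $f \in \mathcal{F}_T$, not identically zero, with $f \in \mathrm{span}\{\mathcal{D}g : g \in S_T\}^{\perp}$. Then $\langle f, \mathcal{D}g \rangle = 0$ for all $g \in S_T$, so $\|f\|_{BAC} = 0$, which is a contradiction.

(4) Define $\|f\|_D = \inf\big\{ \sum_{i=1}^{k} |\lambda_i| : f = \sum_{i=1}^{k} \lambda_i \mathcal{D}g_i,\ g_i \in S_T \big\}$, which can be easily verified to be a norm on $\mathcal{F}_T$. Now let $\phi, f \in \mathcal{F}_T$, $f = \sum_{i=1}^{k} \lambda_i \mathcal{D}g_i$, $g_i \in S_T$; then
\[
|\langle \phi, f \rangle| \le \sum_{i=1}^{k} |\lambda_i| \, |\langle \phi, \mathcal{D}g_i \rangle| \le \|\phi\|_{BAC} \sum_{i=1}^{k} |\lambda_i|,
\]
and taking the infimum over all such representations gives $|\langle \phi, f \rangle| \le \|\phi\|_{BAC} \|f\|_D$, so $\|f\|^*_{BAC} \le \|f\|_D$. Next, for all $g \in S_T$ we have $\|\mathcal{D}g\|_D \le 1$; then
\[
\|f\|_{BAC} = \sup_{g \in S_T} |\langle f, \mathcal{D}g \rangle| \le \sup_{\|h\|_D \le 1} |\langle f, h \rangle| = \|f\|^*_D,
\]
so $\|f\|_{BAC} \le \|f\|^*_D$, i.e. $\|f\|^*_{BAC} \ge \|f\|_D$. So $\|f\|^*_{BAC} = \|f\|_D$.

Now let us prove the following lemma, whose proof relies on the dual function estimate. From here on we consider our inner product $\langle \cdot \rangle_\nu$ and the norm $\|\cdot\|_{\Box^d_\nu}$. The argument also works for any norm for which the corresponding dual functions have bounded dual norm.

Lemma 4.1. Let $\phi \in \mathcal{F}_T$ be such that $\|\phi\|^*_{BAC} \le C$ and let $\eta > 0$. Let $\phi_+ := \max\{0, \phi\}$. Then there is a polynomial $P(u) = a_m u^m + \dots + a_1 u + a_0$ such that
(1) $\|P(\phi) - \phi_+\|_\infty \le \eta$,
(2) $\|P(\phi)\|^*_{\Box^d_\nu} \le R_{P,T}(C)$,
where $P(x) = \sum_{j=0}^{m} a_j x^j$ is a polynomial such that $|P(x) - x_+| \le \eta$ for all $x \in [-CT, CT]$ and $R_{P,T}$ is the polynomial
\[
R_{P,T}(x) = \sum_{j=0}^{m} K |a_j| x^j.
\]
Here $K$ is the constant in the dual function estimate, which may be taken to be something like $2$ if $N$ is sufficiently large, but let us just leave it as $K$ for now.

Remark 4.2. Note that it is possible to choose $P$ so that $R_{P,T}(\frac{1}{\epsilon}) = \exp\big( (\frac{\epsilon}{T})^{-O(1)} \big)$.

Proof.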
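Suppose $\|\phi\|^*_{BAC} \le C$; then there exist $g_1, \dots, g_k \in S_T$ and $\lambda_1, \dots, \lambda_k$ such that $\phi = \sum_{i=1}^{k} \lambda_i \mathcal{D}g_i$ and $\sum_{1 \le i \le k} |\lambda_i| \le C$. Hence
\[
|\phi(x_1, \dots, x_d)| \le \Big( \sum_{i=1}^{k} |\lambda_i| \Big) \Big( \max_{1 \le i \le k} |\mathcal{D}g_i(x_1, \dots, x_d)| \Big) \le CT.
\]
Hence the range of $\phi$ satisfies $\phi(\Omega(T)) \subseteq [-CT, CT]$. Then by the Weierstrass approximation theorem, there is a polynomial $P$ (which may depend on $C, T, \eta$) such that
\[
|P(u) - u_+| \le \eta \qquad \forall\, |u| \le CT,
\]
and so $\|P(\phi) - \phi_+\|_\infty \le \eta$ and we have (1). Now using the dual function estimate, we have
\[
\|\phi^j\|^*_{\Box^d_\nu} = \Big\| \Big( \sum_{1 \le i \le k} \lambda_i \mathcal{D}g_i \Big)^j \Big\|^*_{\Box^d_\nu} \le \sum_{1 \le i_1, \dots, i_j \le k} |\lambda_{i_1} \cdots \lambda_{i_j}| \, \big\| \mathcal{D}g_{i_1} \cdots \mathcal{D}g_{i_j} \big\|^*_{\Box^d_\nu} \le K \sum_{1 \le i_1, \dots, i_j \le k} |\lambda_{i_1} \cdots \lambda_{i_j}| = K \Big( \sum_{1 \le i \le k} |\lambda_i| \Big)^j \le K C^j.
\]
Hence $\|P(\phi)\|^*_{\Box^d_\nu} \le \sum_{j=0}^{m} |a_j| K C^j \le R_{P,T}(C)$.

Now we are ready to prove the transference principle.

Theorem 4.2. Suppose $\nu$ is a pseudorandom independent weight system. Let $f \in \mathcal{F}$ with $0 \le f(x_{[d]}) \le \nu_{[d]}(x_{[d]})$, and let $\eta > 0$. Suppose $N \ge N(\eta, T)$ is large enough; then there are functions $g, h$ on $X_1 \times \dots \times X_d$ such that
(1) $f = g + h$ on $\Omega(T)$,
(2) $0 \le g \le 2$ on $\Omega(T)$,
(3) $\|h \cdot 1_{\Omega(T)}\|_{\Box^d_\nu} \le \epsilon$.

To prove this theorem, note that $h \cdot 1_{\Omega(T)} \in S_T$ and
\[
\|h \cdot 1_{\Omega(T)}\|_{BAC} \ge \langle h \cdot 1_{\Omega(T)}, \mathcal{D}(h \cdot 1_{\Omega(T)}) \rangle_\nu = \|h \cdot 1_{\Omega(T)}\|_{\Box^d_\nu}^{2^d},
\]
so it suffices to show the following.

Theorem 4.3. With the same assumptions as in Theorem 4.2, there are functions $g, h$ such that
(1) $f = g + h$ on $\Omega(T)$,
(2) $0 \le g \le 2$ on $\Omega(T)$,
(3) $\|h \cdot 1_{\Omega(T)}\|_{BAC} \le \epsilon$.
Here the BAC-norm is taken with respect to $\langle \cdot \rangle_\nu$.

The following lemma will be used in the next proof.

Lemma 4.4 (Hahn-Banach theorem; see e.g. Corollary 3.2 in [6]). Let $K_1, \dots, K_r$ be closed convex subsets of $\mathbb{R}^d$, each containing $0$, and suppose $f \in \mathbb{R}^d$ cannot be written as a sum $f_1 + \dots + f_r$, $f_i \in c_i K_i$, $c_i > 0$. Then there is a linear functional $\phi$ such that $\langle f, \phi \rangle > 1$ and $\langle g, \phi \rangle \le c_i^{-1}$ for all $i \le r$ and all $g \in K_i$.

Proof of Theorem 4.3. Define
\[
K := \{g \in \mathcal{F} : 0 \le g \le 2 \text{ on } \Omega(T)\}, \qquad L := \{h \in \mathcal{F} : \|h\|_{BAC} \le \epsilon\}.
\]
Then it is clear that $K, L$ are convex. (Also $0 \in K$, $0 \in \mathrm{Int}(L)$ and then $0 \in \mathrm{Int}(K + L)$.)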
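Assume that $f \notin K + L$ on $\Omega(T)$; then by Lemma 4.4, there exists $\phi \in \mathcal{F}$ such that
(1) $\langle \phi, f \cdot 1_{\Omega(T)} \rangle_\nu > 1$,
(2) $\langle \phi, g \rangle_\nu \le 1 \ \forall g \in K$,
(3) $\langle \phi, h \rangle_\nu \le 1 \ \forall h \in L$.

First, we claim that $\phi \in \mathcal{F}_T$. To see this, suppose $g$ is a function with $\mathrm{supp}(g) \subseteq \Omega(T)^C$, i.e. $g \equiv 0$ on $\Omega(T)$, so $g \in K$. Since $g \in K$, $\langle \phi, g \rangle_\nu \le 1$; but $g$ could be chosen arbitrarily on $\Omega(T)^C$, so we must have $\phi|_{\Omega(T)^C} \equiv 0$ and hence $\phi \in \mathcal{F}_T$. Now let
\[
g(x_{[d]}) = \begin{cases} 2 & \text{if } \phi(x_{[d]}) \ge 0, \\ 0 & \text{otherwise;} \end{cases}
\]
then $g \in K$ and
\[
\langle \phi, g \rangle_\nu = \langle \phi_+, 2 \rangle_\nu = 2 \langle \phi_+, 1 \rangle_\nu \le 1 \ \Rightarrow\ \langle \phi_+, 1 \rangle_\nu \le \tfrac{1}{2}.
\]
Now, since $\phi \in \mathcal{F}_T$, for any $h$ with $\|h \cdot 1_{\Omega(T)}\|_{BAC} \le 1$ we have
\[
\langle \phi, h \cdot 1_{\Omega(T)} \rangle_\nu = \langle \phi, h \rangle_\nu \le \epsilon^{-1}.
\]
Hence if $h' \in \mathcal{F}_T$ and $\|h'\|_{BAC} \le 1$ then $\|h' \cdot 1_{\Omega(T)}\|_{BAC} = \|h'\|_{BAC} \le 1$, so
\[
\langle \phi, h' \rangle_\nu \le \epsilon^{-1} \qquad \forall h' \in \mathcal{F}_T,\ \|h'\|_{BAC} \le 1,
\]
so $\|\phi\|^*_{BAC} \le \epsilon^{-1}$, as $\|\cdot\|_{BAC}$ is a norm on $\mathcal{F}_T$. Now by Lemma 4.1, there is a polynomial $P$ such that
\[
\|P(\phi) - \phi_+\|_\infty \le \tfrac{1}{8} \quad \text{and} \quad \|P(\phi)\|^*_{\Box^d_\nu} \le R_{P,T}(C).
\]
Then
\[
\langle P(\phi), 1 \rangle_\nu \le \langle P(\phi) - \phi_+, 1 \rangle_\nu + \langle \phi_+, 1 \rangle_\nu \le \tfrac{1}{2} + \tfrac{1}{8}.
\]
Also, from the linear forms condition, we have
\[
\|\nu_{[d]}(x_{[d]}) - 1\|_{\Box^d_\nu}^{2^d} = o_{N \to \infty}(1),
\]
so suppose $N \ge N(T, \eta)$ is such that
\[
\|\nu_{[d]}(x_{[d]}) - 1\|_{\Box^d_\nu} \le \frac{1}{8 R_{P,T}(C)};
\]
then
\[
\langle P(\phi), \nu_{[d]} \rangle_\nu = \langle P(\phi), 1 \rangle_\nu + \langle P(\phi), \nu_{[d]} - 1 \rangle_\nu \le \tfrac{1}{2} + \tfrac{1}{8} + \|P(\phi)\|^*_{\Box^d_\nu} \, \|\nu_{[d]} - 1\|_{\Box^d_\nu} \le \tfrac{1}{2} + \tfrac{1}{4} = \tfrac{3}{4},
\]
\[
|\langle \nu_{[d]}, \phi_+ \rangle_\nu| \le |\langle \nu_{[d]}, \phi_+ - P(\phi) \rangle_\nu| + |\langle \nu_{[d]}, P(\phi) \rangle_\nu| \le \|\phi_+ - P(\phi)\|_\infty \langle \nu_{[d]}, 1 \rangle_\nu + \langle \nu_{[d]}, P(\phi) \rangle_\nu \le \tfrac{1}{8} \cdot \tfrac{1}{2} + \tfrac{3}{4}.
\]
Hence
\[
\langle f \cdot 1_{\Omega(T)}, \phi \rangle_\nu = \langle f, \phi \rangle_\nu \le \langle f, \phi_+ \rangle_\nu \le \langle \nu_{[d]}, \phi_+ \rangle_\nu \le \tfrac{3}{4} + \tfrac{1}{10} < 1,
\]
which is a contradiction. Hence $f \in K + L$ on $\Omega(T)$.

Now we can rephrase Theorem 4.3 as follows.

Theorem 4.5 (Transference Principle). Suppose $\nu$ is an independent weight system with $\|\nu - 1\|_{\Box^d_\mu} \le \epsilon' = \exp\big( -(\frac{T}{\epsilon})^{O(1)} \big)$. Let $f \in \mathcal{F}$, $0 \le f \le \nu$ and $0 < \eta < \frac{1}{T}$; then there exist $f_1, f_2, f_3 \in \mathcal{F}$ such that
(1) $f = f_1 + f_2 + f_3$,
(2) $0 \le f_1 \le 2$, $\mathrm{supp}(f_1) \subseteq \Omega(T)$,
(3) $\|f_2\|_{\Box^d_\nu} \le \epsilon$, $\mathrm{supp}(f_2) \subseteq \Omega(T)$,
(4) $0 \le f_3 \le \nu$, $\mathrm{supp}(f_3) \subseteq \Omega(T)^C$, $\|f_3\|_{L^1_\nu} \lesssim \frac{1}{T}$.

Proof. Let $g, h$ be as in Theorem 4.3. Take $f_1 = g \cdot 1_{\Omega_T}$, $f_2 = h \cdot 1_{\Omega_T}$; then $f \cdot 1_{\Omega_T} = f_1 + f_2$. Let $f_3 = f \cdot 1_{\Omega_T^C}$.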
Now by linear form condition T kf3 kL1ν ≤ 1 Ex f · Df · T [d] Y νI (xI ) I⊆[d],|I|<d Y Y Y 1 1 = Ex[d] Ey[d] νI (xI ) νI ((PωI (xI , yI ))) . T T I⊆[d] ωI 6=0 I⊆[d] Remark 4.3. We will choose here to be as in the removal lemma (Theorem 5.3) and T = T () to be δ()−1 for the δ() in the removal lemma. 5. R ELATIVE H YPERGRAPH R EMOVAL L EMMA First let us recall the statement of a version of functional hypergraph removal lemma [24].1 Recall the definition of Λ in equation (2.1). Theorem 5.1. Given finite measure spaces (X1 , µX1 ), ..., (Xd+1 , µXd+1 ) and f (i) : XI → [0, 1], I = [d + 1]\{i} Let > 0, suppose |Λd+1 (f (1) , ..., f (d) , f (d+1) )| ≤ . Then for 1 ≤ i ≤ d, there exists Ei ⊆ X[d+1]\{i} such that Q 1≤j≤d+1 1Ej ≡ 0 and for 1 ≤ i ≤ d + 1, Z Z ··· f (i) · 1E C dµX1 · · · dµXd dµXd+1 ≤ δ() X1 Xd+1 i where δ() → 0 as → 0. The proof of removal lemma relies on functional version of Szemerédi’s Regularity Lemma [24]. If B is a finite factor of X i.e. a finite σ−algebra of measurable sets in X, then B is a partition of X into atoms A1 , ..., AM . Let f : X → R be measurable then we define the conditional expectation E(f |B) : X → R is R defined by E(f |B)(x) = (1/|Ai |) Ai f (x)dµX if x ∈ Ai (defined up to set of measure zero). We say that B has complexity at most m if it is generated by at most m sets. If BX is a finite factor of X with atoms A1 , ..., AM and BY is a finite factor of Y with atoms B1 , ..., BN then BX ∨ BY is a finite factor of X × Y with atoms Ai × Bj , 1 ≤ i ≤ M, 1 ≤ j ≤ N. Theorem 5.2 (Szemerédi’s Regularity Lemma [24]2). Let f : X[d] → [0, 1] be measurable, let τ > 0 and F : N → N be arbitrary increasing functions (possibly depends on τ ). Then there is an integer M = OF,τ (1), factors BI (I ⊆ [d], |I| = d − 1) on XI of complexity at most M such that f = f1 + f2 + f3 where W • f1 = E(f | I⊆[d],|I|=d−1 BI ). • kf2 kL2ν ≤ τ. 
1In fact the paper [24] proves this theorem only with the counting measure (with thenotion of e−discrepancy in place of Box norm). But the proof also works for any finite measure that has direct product structure (with the notion of weighted Box Norm).(see [21] for the case of probability measures in d = 2, 3). However we don’t know how to genralize this argument to arbitrary measure on the product space. If we can prove this theorem for any measure µX1 ×...×Xd then we would be able to prove multidimensional Green-Tao’s Theorem. 2This theorem is proved for counting measure in [24] but the proof would work for any product measure on the product spaces. 16 ÁKOS MAGYAR AND TATCHAI TITICHETRAKUN • kf3 kdν ≤ F (M )−1 . • f1 , f1 + f2 ∈ [0, 1]. Remark 5.1. A consequence from this lemma that we will use later is the following: since f1 is a constant W on each atom of I.|I|=d−1 BI , we can decompose f1 as a finite sum of O(M ) = OF,τ (1) terms of lower Q complexity functions i.e.a finite sum of product di=1 Ji where Ji is a function in x[d]\{i} variable and takes values in [0, 1]. One advantage of exploting conditional expectation is that it is easy to see that f1 is positive. Our main goal is to prove the following version of weighted removal lemma using Theorem 5.1 and transference principle. (i) Theorem 5.3 (Weighted Simplex-Removal Lemma). Q Suppose 0 ≤ f (x[d+1]\{i} ) ≤ ν[d+1]\{i} (x[d+1]\{i} ). Let > 0, Suppose |Λ| ≤ then there exist Ei ⊆ j∈[d+1]\{i} Xj such that for 1 ≤ i ≤ d + 1, Y 1Ei ≡ 0 • Ri∈[d+1] R • X1 · · · Xd+1 f (i) 1E C dµX1 · · · dµXd+1 ≤ δ() i where δ() → 0 as → 0. Proof. 
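Using the transference principle (Theorem 4.5) for $1 \le i \le d+1$, write $f^{(i)} = g^{(i)} + h^{(i)} + k^{(i)}$ where
(1) $f^{(i)} = g^{(i)} + h^{(i)} + k^{(i)}$
(2) $0 \le g^{(i)} \le 2$, $\mathrm{supp}(g^{(i)}) \subseteq \Omega^{(i)}(T)$
(3) $\|h^{(i)}\|_{\Box^d_\nu} \le \epsilon$, $\mathrm{supp}(h^{(i)}) \subseteq \Omega^{(i)}(T)$
(4) $k^{(i)} = f^{(i)} \cdot 1_{(\Omega^{(i)}(T))^C}$
where $\Omega^{(i)}(T) = \{ x_{[d+1]\setminus\{i\}} : |Df^{(i)}| \le T \}$, $1 \le i \le d+1$.

Step 1: We will show that if $T \ge T(\epsilon)$ is sufficiently large then
\[ \Lambda_{d+1}(g^{(1)} + h^{(1)}, \dots, g^{(d+1)} + h^{(d+1)}) = \Lambda_{d+1}(f^{(1)} - k^{(1)}, \dots, f^{(d+1)} - k^{(d+1)}) \lesssim \epsilon . \]

Proof of Step 1: The term on the left-hand side can be written as a sum, over $I \subseteq [d+1]$, of the terms
\[ \Lambda_{d+1,I}(e^{(1)}, \dots, e^{(d)}, e^{(d+1)}), \qquad e^{(i)} = \begin{cases} -k^{(i)} & \text{if } i \in I \\ f^{(i)} & \text{if } i \notin I . \end{cases} \]
If $I = \emptyset$ then $\Lambda_{d+1}(f^{(1)}, \dots, f^{(d)}, f^{(d+1)}) \le \epsilon$ by the assumption. Suppose $I = \{i_1, \dots, i_r\} \ne \emptyset$; then
\[ |\Lambda_{d+1,I}(e^{(1)}, \dots, e^{(d)}, e^{(d+1)})| = \int_{X_1} \cdots \int_{X_{d+1}} f^{(1)} \cdots f^{(d+1)} \cdot \prod_{i \in I} 1_{(\Omega^{(i)})^C}\, d\mu_{X_1} \cdots d\mu_{X_{d+1}} \]
\[ \le \mathbb{E}_{x_{[d+1]}} \prod_{I \subseteq [d+1],\, |I| \le d} \nu_I(x_I)\, 1_{(\Omega^{(i_1)})^C} \]
\[ \le \frac{1}{T}\, \mathbb{E}_{x_{[d+1]}} \mathbb{E}_{y_{[d+1]\setminus\{i_1\}}} \prod_{I \subseteq [d+1],\, |I| \le d} \nu_I(x_I) \prod_{\omega_I \ne 0,\ I \subseteq [d+1]\setminus\{i_1\}} \nu_I(P_{\omega_I}(x_I, y_I)) \lesssim \frac{1}{T} \]
by the linear form condition.

Step 2: We will show $\Lambda_{d+1}(g^{(1)}, \dots, g^{(d+1)}) \lesssim \epsilon$ if $N \ge N(\epsilon)$.

Proof of Step 2: Write $g^{(i)} = g^{(i)} + h^{(i)} - h^{(i)} = f^{(i)} \cdot 1_{\Omega^{(i)}(T)} - h^{(i)}$; then we have
\[ 0 \le f^{(i)} \cdot 1_{\Omega^{(i)}(T)} \le \nu_i, \qquad \|h^{(i)}\|_{\Box^d_\nu} \le \epsilon, \]
so by the weighted von Neumann inequality and Step 1, we have
\[ |\Lambda_{d+1}(g^{(1)}, \dots, g^{(d+1)})| = |\Lambda_{d+1}(g^{(1)} + h^{(1)}, \dots, g^{(d+1)} + h^{(d+1)}) - \Lambda_{d+1}(h^{(1)}, \dots, h^{(d)}, h^{(d+1)})| \lesssim \epsilon + o_{N\to\infty}(1) \lesssim \epsilon \]
if $\tau \le \epsilon$, $N \ge N(\epsilon)$, and the proof of Step 2 is completed.

Now since $0 \le g^{(i)} \le 2$, (after normalizing) using the unweighted hypergraph removal lemma (Theorem 5.1), we have
\[ F_i \subseteq X_{[d+1]\setminus\{i\}}, \quad F_i \in \mathcal{B}_{[d+1]\setminus\{i\}}, \quad \mathrm{compl}(\mathcal{B}_{[d+1]\setminus\{i\}}) \le M \]
such that $\prod_{1 \le k \le d+1} 1_{F_k} \equiv 0$ and
\[ \int_{X_1} \cdots \int_{X_{d+1}} g^{(i)} \cdot 1_{F_i^C}\, d\mu_{X_1} \cdots d\mu_{X_{d+1}} \lesssim \delta(\epsilon), \]
so
\[ \int_{X_1} \cdots \int_{X_{d+1}} f^{(i)} \cdot 1_{F_i^C}\, d\mu_{X_1} \cdots d\mu_{X_{d+1}} \lesssim \delta(\epsilon) + \underbrace{\int_{X_1} \cdots \int_{X_{d+1}} h^{(i)} \cdot 1_{F_i^C}\, d\mu_{X_1} \cdots d\mu_{X_{d+1}}}_{(A)} + \underbrace{\int_{X_1} \cdots \int_{X_{d+1}} f^{(i)} \cdot 1_{(\Omega^{(i)}(T))^C}\, 1_{F_i^C}\, d\mu_{X_1} \cdots d\mu_{X_{d+1}}}_{(B)} . \]
Now for our purpose, it suffices to show $(A), (B) \lesssim \epsilon$.

Estimate for (A): By the regularity lemma,$^3$ the function $1_{F_i^C}$ can be written as a sum of $O_{F,\tau}(1)$ functions of the form $\prod_{j \in [d+1]\setminus\{i\}} u_j^{(i)}$, where $u_j^{(i)}$ is a $[0, 1]$-valued function of $x_{[d+1]\setminus\{i,j\}}$. Hence it suffices to consider functions of the form $\prod_{j \in [d+1]\setminus\{i\}} u_j^{(i)}$ in place of $1_{F_i^C}$. We have the following easy lemma, which roughly states that Gowers uniform functions are orthogonal to lower order functions (in particular, they are uniformly distributed across lower order sets).

$^3$ We need this since we do not have something like $\|fg\|_{\Box_\nu} \le \|f\|_{\Box_\nu} \|g\|_{\Box_\nu}$.

Lemma 5.4. Let $h^{(i)} : X_{[d+1]\setminus\{i\}} \to \mathbb{R}$ and let $u_j^{(i)}$ be as above. Then
\[ \Big| \int_{X_1} \cdots \int_{X_d} \int_{X_{d+1}} h^{(i)} \prod_{1 \le j \le d+1,\ j \ne i} u_j^{(i)}\, d\mu_{X_1} \cdots d\mu_{X_d}\, d\mu_{X_{d+1}} \Big|^{2^d} \le \|h\|^{2^d}_{\Box^d_\mu} . \]

Proof of Lemma. WLOG assume $i < d+1$. Applying the Cauchy-Schwarz inequality in the $(x_1, \dots, x_d)$ variables and using that $u_{d+1}^{(i)}$ is bounded, we have
\[ \Big( \Big( \int_{X_1} \cdots \int_{X_d} \Big( \int_{X_{d+1}} h^{(i)} \prod_{1 \le j \le d,\ j \ne i} u_j^{(i)}\, d\mu_{X_{d+1}} \Big) u_{d+1}^{(i)}\, d\mu_{X_1} \cdots d\mu_{X_d} \Big)^2 \Big)^{2^{d-1}} \]
\[ \le \Big( \int_{X_1} \cdots \int_{X_d} \Big( \int_{X_{d+1}} h^{(i)} \prod_{1 \le j \le d,\ j \ne i} u_j^{(i)}\, d\mu_{X_{d+1}} \Big)^2 d\mu_{X_1} \cdots d\mu_{X_d} \Big)^{2^{d-1}} \times \Big( \int_{X_1} \cdots \int_{X_d} \big(u_{d+1}^{(i)}\big)^2\, d\mu_{X_1} \cdots d\mu_{X_d} \Big)^{2^{d-1}} \]
\[ \lesssim \Big( \int_{X_1} \cdots \int_{X_d} \int_{X_{d+1}} \int_{Y_{d+1}} h^{(i)}(x_{[d]\setminus\{i\}}, x_{d+1})\, h^{(i)}(x_{[d]\setminus\{i\}}, y_{d+1}) \prod_{1 \le j \le d,\ j \ne i} u_j^{(i)}(x_{[d]\setminus\{i\}}, x_{d+1})\, u_j^{(i)}(x_{[d]\setminus\{i\}}, y_{d+1})\, d\mu_{X_1} \cdots d\mu_{X_{d+1}}\, d\mu_{Y_{d+1}} \Big)^{2^{d-1}} . \]
Continuing to apply the Cauchy-Schwarz inequality in this way $d - 1$ more times to eliminate each $u_j^{(i)}$, we end up with
\[ \int_{X_I} \int_{Y_I} \prod_{\omega_I} h^{(i)}(P_{\omega_I}(x_I, y_I))\, d\mu_{X_I}\, d\mu_{Y_I} = \|h\|^{2^d}_{\Box^d_\mu} \]
as required.

Now applying the lemma, we can estimate the expression (A):
\[ \Big| \int_{X_1} \cdots \int_{X_d} \int_{X_{d+1}} h^{(i)} \cdot 1_{F_i^C}\, d\mu_{X_1} \cdots d\mu_{X_d}\, d\mu_{X_{d+1}} \Big|^{2^d} \lesssim \Big| \int_{X_1} \cdots \int_{X_d} \int_{X_{d+1}} h^{(i)} \prod_{1 \le j \le d+1,\ j \ne i} u_j^{(i)}\, d\mu_{X_1} \cdots d\mu_{X_d}\, d\mu_{X_{d+1}} \Big|^{2^d} \le \|h\|^{2^d}_{\Box^d_\mu} \le \epsilon \]
as required.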
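As a sanity check on Lemma 5.4, the following minimal sketch tests the simplest case $d = 2$ numerically, assuming finite sets with the uniform probability measure: for a real function $h(x, y)$ and $[0,1]$-valued functions $u(x)$, $v(y)$, it compares $|\mathbb{E}\, h(x,y) u(x) v(y)|^4$ with $\|h\|_{\Box}^4 = \mathbb{E}\, h(x,y) h(x',y) h(x,y') h(x',y')$. The function names and the random test data are illustrative choices, not part of the paper.

```python
import random


def box_norm_pow4(h):
    """Return ||h||_box^4 = E_{x,x',y,y'} h(x,y) h(x',y) h(x,y') h(x',y')
    for a real matrix h, under the uniform probability measure."""
    n, m = len(h), len(h[0])
    total = 0.0
    for x in range(n):
        for xp in range(n):
            # inner = E_y h(x,y) h(x',y); the 4th power of the box norm
            # is the average of inner^2 over all pairs (x, x').
            inner = sum(h[x][y] * h[xp][y] for y in range(m)) / m
            total += inner * inner
    return total / (n * n)


def correlation(h, u, v):
    """Return E_{x,y} h(x,y) u(x) v(y)."""
    n, m = len(h), len(h[0])
    return sum(h[x][y] * u[x] * v[y]
               for x in range(n) for y in range(m)) / (n * m)


def check_lemma_d2(n=12, m=10, trials=200, seed=0):
    """Check |E h u v|^4 <= ||h||_box^4 on random instances."""
    rng = random.Random(seed)
    for _ in range(trials):
        h = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(n)]
        u = [rng.random() for _ in range(n)]  # values in [0, 1)
        v = [rng.random() for _ in range(m)]  # values in [0, 1)
        if correlation(h, u, v) ** 4 > box_norm_pow4(h) + 1e-12:
            return False
    return True
```

The inequality being tested is exactly two applications of Cauchy-Schwarz, once in $x$ and once in the pair $(y, y')$, which is the $d = 2$ instance of the iteration used in the proof above.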
Estimate for (B): Ignoring the $1_{F_i^C}$ term, we have
\[ \int_{X_1} \cdots \int_{X_{d+1}} f^{(i)} \cdot 1_{(\Omega^{(i)}(T))^C} \cdot 1_{F_i^C}\, d\mu_{X_1} \cdots d\mu_{X_{d+1}} \le \int_{X_1} \cdots \int_{X_{d+1}} \nu_{[d+1]\setminus\{i\}} \cdot 1_{(\Omega^{(i)}(T))^C}\, d\mu_{X_1} \cdots d\mu_{X_{d+1}} \]
\[ \le \frac{1}{T}\, \mathbb{E}_{x_{[d+1]\setminus\{i\}}} \mathbb{E}_{y_{[d+1]\setminus\{i\}}}\, \nu_{[d+1]\setminus\{i\}}(x_{[d+1]\setminus\{i\}}) \prod_{|I| \le d,\ i \notin I} \nu_I(x_I) \prod_{\omega_I \ne 0,\ I \subseteq [d+1]\setminus\{i\}} \nu_I(P_{\omega_I}(x_I, y_I)) \lesssim \frac{1}{T}, \]
by the linear forms condition. Hence if we choose $T$ sufficiently large, e.g. $T = \delta(\epsilon)^{-1}$, then
\[ \int_{X_1} \cdots \int_{X_{d+1}} f^{(i)} \cdot 1_{F_i^C}\, d\mu_{X_1} \cdots d\mu_{X_{d+1}} \lesssim \delta(\epsilon). \]

6. PROOF OF THE MAIN RESULT

6.1. From $\mathbb{Z}_N$ to $\mathbb{Z}$. Now recall that $\nu_{\delta_1,\delta_2}(n) \approx \frac{\phi(W)}{W} \log N$, $\delta_1 N \le n \le \delta_2 N$, $\delta_1, \delta_2 \in (0, 1]$, for a sufficiently large prime $N$ in the residue class $b \pmod W$. By the pigeonhole principle choose $b \in (\mathbb{Z}_W^\times)^d$ such that
\[ |A \cap ((W\mathbb{Z})^d + b)| \ge \alpha \frac{N^d}{(\log^d N)\, \phi(W)^d} . \]
Now consider $A_b = \{ n \in [1, N/W]^d : Wn + b \in A \}$ and let $\delta_2 \in (0, 1)$. Then by the Prime Number Theorem there is a prime $N'$ such that $\delta_2 N' = (1 + \delta) \frac{N}{W}$ for an arbitrarily small real number $\delta$. Then if $N$ is sufficiently large and $\delta$ is sufficiently small with respect to $\alpha$,
\[ |A_b \cap [1, \delta_2 N']^d| \ge \frac{\alpha \delta_2^d\, (N' W)^d}{2\, (\log^d N')\, \phi(W)^d} . \tag{6.1} \]
On the other hand, by Dirichlet's theorem on primes in arithmetic progressions, the number of $n \in [1, N']^d \setminus [\delta_1 N', N']^d$ for which $Wn + b \in \mathbb{P}^d$ is $\le c_d\, \delta_1 \frac{(N' W)^d}{(\log^d N')\, \phi(W)^d}$. Hence the estimate (6.1) holds for $A' := A_b \cap [\delta_1 N', \delta_2 N']^d$ as well, provided that $\delta_1$ is small enough.

Now we may consider $A'$ in place of $A$ (we are working in the group $\mathbb{Z}^d_{N'}$ instead). If we identify the group $\mathbb{Z}^d_{N'}$ with $[-\frac{N'}{2}, \frac{N'}{2}]^d$, then for sufficiently small $\delta_1, \delta_2$ the points of $A'$ are the same when we change from $\mathbb{Z}^d_{N'}$ to $[-\frac{N'}{2}, \frac{N'}{2}]^d$ (no wrap-around issue).

6.2. Proof of the Main Theorem. To prove the theorem, suppose on the contrary that $A'$ contains fewer than $\epsilon \frac{N'^{d+1}}{(\log N')^{2d}}$ corners ($\epsilon = c(\alpha)$). Then
\[ \Lambda_{d+1}(f^{(1)}, \dots, f^{(d+1)}) = (N')^{-(d+1)} \sum_{x_{[d+1]}} \prod_{1 \le i \le d} 1_{A'}\Big(x_1, \dots, x_{i-1},\ x_{d+1} - \sum_{1 \le j \le d,\ j \ne i} x_j,\ x_{i+1}, \dots, x_d\Big) \nu_I\, 1_{A'}(x_1, \dots, x_d) \cdot \nu(x_1) \cdots \nu(x_d) \]
\[ \le \frac{1}{N'^{d+1}} \sum_{\substack{p_i \in A',\ 1 \le i \le 2d \\ \text{that constitute a corner}}}\ \prod_{1 \le k \le d} 1_{A'}(p_1, \dots, p_{k-1}, p_{d+k}, p_{k+1}, \dots, p_d)\, 1_{A'}(p_1, \dots, p_d)\, \nu(p_1) \cdots \nu(p_{2d}) \]
\[ \le \frac{1}{N'^{d+1}} \Big( \frac{\phi(W) \log N'}{W} \Big)^{2d} \times (\text{the number of corners in } A') \le \epsilon . \]
Now assume that $\Lambda_{d+1}(f^{(1)}, \dots, f^{(d)}, f^{(d+1)}) \lesssim \epsilon$. Then by the relative hypergraph removal lemma there exist $E_i$, $1 \le i \le d+1$, $E_i \subseteq X_{[d+1]\setminus\{i\}} := \tilde{X}_i$, such that
\[ \prod_{1 \le i \le d+1} 1_{E_i} \equiv 0, \qquad \int_{\tilde{X}_i} f^{(i)} 1_{E_i^C}\, d\mu_{\tilde{X}_i} \lesssim \delta(\epsilon), \]
where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$. Let $A' = A \cap [\delta_1 N, \delta_2 N]^d$, $z = \sum_{1 \le j \le d} x_j$, and $g_{A'} := g \cdot 1_{A'}$ for any function $g$. Then
\[ \tilde{\Lambda} := N'^{-d} \sum_{(x_1, \dots, x_d) \in A'} f^{(1)}_{A'}(x_2, \dots, x_d, z)\, f^{(2)}_{A'}(x_1, x_3, \dots, x_d, z) \cdots f^{(d)}_{A'}(x_1, x_2, \dots, x_{d-1}, z)\, f^{(d+1)}_{A'}(x_1, \dots, x_d) \]
\[ \ge N'^{-d} \sum_{(x_1, \dots, x_d) \in A'} \nu(x_1) \cdots \nu(x_d) \gtrsim (N')^{-d} \Big( \frac{\phi(W) \log N'}{W} \Big)^d \cdot \frac{\alpha\, (N' W)^d}{(\phi(W) \log N')^d} = \alpha \]
for arbitrarily large $N'$. Now
\[ \tilde{\Lambda} = \mathbb{E}_{x_{[d]}} \big( f^{(1)}_{A'} 1_{E_1} + f^{(1)}_{A'} 1_{E_1^C} \big) \cdots \big( f^{(d+1)}_{A'} 1_{E_{d+1}} + f^{(d+1)}_{A'} 1_{E_{d+1}^C} \big) . \]
By the assumption, $\mathbb{E}_{x_{[d]}} f^{(1)}_{A'} \cdot 1_{E_1} \cdots f^{(d+1)}_{A'} \cdot 1_{E_{d+1}} \equiv 0$, so we just need to estimate each of the other terms individually.

Consider $\mathbb{E}_{x_{[d]}} f^{(1)}_{A'} \cdot 1_{E_1^C}\, f^{(2)}_{A'} \cdot 1_{E_2^\pm} \cdots f^{(d+1)}_{A'} \cdot 1_{E_{d+1}^\pm}$, where $F^\pm$ can be either $F$ or $F^C$ for any set $F$. Now since
\[ 0 \le f^{(j)}_{A'} 1_{E_j^\pm} \le \nu(x_j),\ \ d \ge j \ge 2, \qquad\text{and}\qquad 0 \le f^{(d+1)}_{A} \le 1, \]
we have
\[ \mathbb{E}_{x_{[d]}} f^{(1)}_{A'} \cdot 1_{E_1^C}\, f^{(2)}_{A'} \cdot 1_{E_2^\pm} \cdots f^{(d+1)}_{A'} \cdot 1_{E_{d+1}^\pm} \le \mathbb{E}_{x_{[d]}} f^{(1)}_{A'} \cdot 1_{E_1^C}\, \nu(x_2) \cdots \nu(x_d) = \int_{\tilde{X}_1} f^{(1)} \cdot 1_{E_1^C}\, d\mu_{X_2} \cdots d\mu_{X_{d+1}} \lesssim \delta(\epsilon). \]
In the same way, we have for any $1 \le i \le d+1$,
\[ \mathbb{E}_{x_{[d]}} f^{(i)}_{A'} \cdot 1_{E_i^C} \prod_{1 \le j \le d+1,\ j \ne i} \big( f^{(j)} \cdot 1_{E_j^\pm} \big) \lesssim \delta(\epsilon). \]
So if $N' > N(\alpha)$ then
\[ \mathbb{E}_{x_{[d]}} f^{(1)}_{A'}(x_2, \dots, x_d, z)\, f^{(2)}_{A'}(x_1, x_3, \dots, x_d, z) \cdots f^{(d)}_{A'}(x_1, \dots, x_{d-1}, z)\, f^{(d+1)}_{A'}(x_1, \dots, x_d) \lesssim \delta(\epsilon) = o(\alpha). \]
This is a contradiction. Hence there are $\gtrsim \frac{N'^{d+1}}{(\log N')^{2d}}$ corners in $A$. Note that the number of degenerate corners is at most $O\big(\frac{N'^d}{(\log N')^d}\big)$, as a corner is degenerate (and degenerates into a single point) iff $z = \sum_{1 \le j \le d} x_j$. Hence there are at least $c(\alpha) \frac{(N')^{d+1}}{(\log N')^{2d}}$ corners.

Remark 6.1. It is possible to extract an explicit bound $c(\alpha)$ from this argument, but it is not a good bound (iterated tower type) due to the use of the regularity lemma; see [7]. This is also the best bound in the integer case. For $d = 2$, the best bound is due to Shkredov [18], which is an exponential-type bound.

APPENDIX A. GREEN-TAO'S MEASURE AND PSEUDORANDOMNESS

A.1. Pseudorandom Measure Majorizing Primes. In analytic number theory, the following von Mangoldt function is used as a characteristic function of the primes:
\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^k,\ k \ge 1 \\ 0 & \text{otherwise.} \end{cases} \]
The primes have local obstructions that prevent them from being random: $\Lambda(n)$ is concentrated on just $\phi(q)$ residue classes (mod $q$). The small primes cause this kind of effect more than the larger primes, as they have larger density. However, we can get rid of this kind of obstruction on all small residue classes, without much affecting what we are counting, by the so-called W-trick [8]: Let $\omega(N)$ be a sufficiently slowly growing function of $N$. Let $W = \prod_{p \le \omega(N)} p$. Let $b$ be any positive integer with $(b, W) = 1$. By the Prime Number Theorem, we have $W = \exp((1 + o(1))\omega(N))$, and $\mathbb{P}_{W,b}$ is uniformly distributed (mod $q$) for $q \le W$.

Remark A.1. We have to choose $W$ to grow sufficiently slowly in $N$; $\omega(N) = \log\log N$ is enough. If we let $W$ grow with $N$ then the error terms from the linear form and correlation conditions go to $0$ as $N \to \infty$. It turns out that we can choose $W$ to be an arbitrarily large fixed constant (see also the remark in Section 11 of [8] or the overspill principle in [23]), and in this case for the error term to go to $0$ we need to let both $N, W \to \infty$. Keeping $W$ independent of $N$ may be important in order to extract a quantitative bound for the main theorem. We look at $\mathbb{P}_{N,W,b} = \{ n : Wn + b \in \mathbb{P}_N \}$ in place of $\mathbb{P}_N$, and for $A \subseteq \mathbb{P}$ we look at $\{ n : Wn + b \in A \cap [N] \}$ instead. We do this by identifying $n$ with the original $Wn + b$.
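The effect of the W-trick on small moduli can be illustrated numerically. The sketch below is an illustration only, with small hypothetical parameters ($\omega = 5$, so $W = 30$, $b = 1$, and a cutoff of $300000$) chosen for speed rather than taken from the paper. It compares how the primes themselves are distributed modulo $q = 3$ with how the integers $n$ such that $Wn + b$ is prime are distributed modulo $3$. The helper names are assumptions of this sketch.

```python
def prime_sieve(limit):
    """Sieve of Eratosthenes: sieve[n] == 1 exactly when n is prime."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Zero out the multiples of p starting from p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sieve


def prime_residue_counts(q, limit):
    """Count the primes <= limit in each residue class mod q."""
    sieve = prime_sieve(limit)
    counts = [0] * q
    for n in range(2, limit + 1):
        if sieve[n]:
            counts[n % q] += 1
    return counts


def w_trick_residue_counts(w, b, q, limit):
    """With W = product of primes <= w, count n mod q over all n >= 0
    such that W*n + b is a prime not exceeding limit."""
    sieve = prime_sieve(limit)
    W = 1
    for p in range(2, w + 1):
        if sieve[p]:
            W *= p
    counts = [0] * q
    n = 0
    while W * n + b <= limit:
        if sieve[W * n + b]:
            counts[n % q] += 1
        n += 1
    return W, counts


if __name__ == "__main__":
    print(prime_residue_counts(3, 300000))          # class 0 holds only the prime 3
    print(w_trick_residue_counts(5, 1, 3, 300000))  # all three classes populated
```

For a prime $q$ dividing $W$, the value $Wn + b$ is congruent to $b$ modulo $q$ for every $n$, so primality puts no constraint on $n$ modulo $q$; this is the sense in which the local obstruction at $q$ has been removed. For a prime $q$ not dividing $W$, one residue class of $n$ is still excluded.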
Applying the W-trick does not much affect what we are counting.$^4$

$^4$ Recall that $\prod_{p \le x} (1 - \frac{1}{p})^{-1} = e^{C_0} \log x + O(1)$, hence $\frac{W}{\phi(W)} = e^{C_0} \log \omega(N) + O(1)$. The density of $\mathbb{P}_{Wn+b}$ in $\mathbb{Z}_{Wn+b}$ is
\[ \approx \frac{\frac{1}{\phi(W)} \frac{N}{\log N}}{N/W} = \frac{W}{\phi(W)} \frac{1}{\log N} \approx \frac{e^{C_0} \log \omega(N) + O(1)}{\log N} \quad (\text{if } W \ll N). \]

We also define the modified von Mangoldt function as follows.

Definition A.1 (Modified von Mangoldt function). For any fixed $b$ with $(b, W) = 1$,
\[ \Lambda_b(n) = \begin{cases} \frac{\phi(W)}{W} \log(Wn + b) & \text{if } Wn + b \text{ is prime,} \\ 0 & \text{otherwise.} \end{cases} \]
The modified von Mangoldt function in dimension $d$ is defined to be the $d$-fold tensor product of $\Lambda_b$, that is
\[ \Lambda_b(x_1, \dots, x_d) = \Lambda_{b_1}(x_1) \cdots \Lambda_{b_d}(x_d), \qquad b = (b_1, \dots, b_d) \in (\mathbb{Z}_N^\times)^d . \]

Remark A.2. Note that the modified von Mangoldt function is not supported on higher prime powers. From the Prime Number Theorem in arithmetic progressions, we have $\mathbb{E}_{n \le N} \Lambda_b(n) \sim 1$. If $W$ is sufficiently large then there is little effect from local obstructions and the function $\Lambda_b$ is more pseudorandom. For example, it can be shown ([10]) that $\|\Lambda_b - 1\|_{U^{s+1}[N]} = o(1)$, which is not true for $\Lambda$.

Now we recall the definition of the Green-Tao measure and the definition of a pseudorandom measure according to [8].

Definition A.2 (Goldston-Yildirim sum). [12], [8]
\[ \Lambda_R(n) = \sum_{d | n,\ d \le R} \mu(d) \log \frac{R}{d} . \]
We may take $R = N^{d^{-1} 2^{-d-5}}$.

Now we define the Green-Tao measure:

Definition A.3 (Green-Tao's pseudorandom measure). [8] For given small parameters $1 \ge \delta_1, \delta_2 > 0$, define a function $\nu_{\delta_1,\delta_2} : \mathbb{Z}_N \to \mathbb{R}$ by
\[ \nu_{\delta_1,\delta_2}(n) = \nu(n) = \begin{cases} \frac{\phi(W)}{W} \frac{\Lambda_R(Wn + b)^2}{\log R} & \text{if } \delta_1 N \le n \le \delta_2 N \\ 0 & \text{otherwise.} \end{cases} \]

Now we summarize some properties of $\nu$ that will be used:
• $\nu$ satisfies the linear forms condition for any parameters depending on $d$ or $\alpha$.
• $\nu(n) \ge d^{-1} 2^{-d-6} \Lambda_b(n)$. To see this, we may assume that $Wn + b$ is prime; then, since $\delta_1 N > R$, $\Lambda_R(Wn + b) = \log R = d^{-1} 2^{-d-5} \log N$, so we have our claim provided $\omega(N)$ is sufficiently slowly growing in $N$. Moreover, if $N$ is a sufficiently large prime in the residue class $b \pmod W$ and is in the support of $\nu$, then $\nu(N) \approx \frac{\phi(W)}{W} \log N$.
• In dimension $d$ we define the Green-Tao measure to be the $d$-fold tensor product of $\nu$, i.e. $\nu^{\otimes d}(x_1, \dots, x_d) = \nu(x_1) \cdots \nu(x_d)$.

Definition A.4 (Linear Form Condition). Let $m_0, t_0 \in \mathbb{N}$ be parameters. We say that $\nu$ satisfies the $(m_0, t_0)$-linear form condition if the following holds for any $m \le m_0$, $t \le t_0$. Suppose $\{a_{ij}\}_{1 \le i \le m,\ 1 \le j \le t}$ are integers and $b_i \in \mathbb{Z}_N$. Let the $m$ (affine) linear forms $L_i : \mathbb{Z}_N^t \to \mathbb{Z}_N$ with $L_i(x) = \sum_{1 \le j \le t} a_{ij} x_j + b_i$ for $1 \le i \le m$ be such that each $L_i$ is nonzero and they are pairwise linearly independent over the rationals. Then
\[ \mathbb{E}\Big( \prod_{1 \le i \le m} \nu(L_i(x)) : x \in \mathbb{Z}_N^t \Big) = 1 + o_{N \to \infty, m_0, t_0}(1). \]

Definition A.5 (Correlation Condition). We say that a measure $\nu$ satisfies the $(m_0, m_1, \dots, m_{l_2})$-correlation condition if there is a function $\tau : \mathbb{Z}_N \to \mathbb{R}_+$ such that
(1) $\mathbb{E}(\tau(x)^m : x \in \mathbb{Z}_N) = O_m(1)$ for any $m \in \mathbb{Z}_+$.
(2) Suppose
• $\phi_i, \psi^{(k)} : \mathbb{Z}_N^t \to \mathbb{Z}_N$ ($1 \le i \le l_1$, $1 \le k \le l_2$, $l_1 + l_2 \le m_0$) are all pairwise linearly independent (over $\mathbb{Z}$) linear forms;
• for each $1 \le g \le l_2$, $1 \le j < j' \le m_g$ we have $a_j^{(g)} \ne 0$, and $a_j^{(g)} \psi^{(g)}(x) + h_j^{(g)}$, $a_{j'}^{(g)} \psi^{(g)}(x) + h_{j'}^{(g)}$ are different (affine) linear forms.
Then we have
\[ \mathbb{E}_{x \in \mathbb{Z}_N^d} \prod_{k=1}^{l_1} \nu(\phi_k(x)) \prod_{k=1}^{l_2} \prod_{j=1}^{m_k} \nu\big(a_j^{(k)} \psi^{(k)}(x) + h_j^{(k)}\big) \le \prod_{k=1}^{l_2} \sum_{1 \le j < j' \le m_k} \tau\Big( W \big(a_{j'}^{(k)} h_j^{(k)} - a_j^{(k)} h_{j'}^{(k)}\big) + \big(a_{j'}^{(k)} - a_j^{(k)}\big) b \Big), \]
where $W = \prod_{p \le \omega(N)} p$.

Theorem A.1. The Green-Tao measure $\nu$ satisfies the linear forms and correlation conditions for any parameters that may depend on $d$ or $\alpha$ (but not on $N$).

Proof. The proof of the linear form condition is the same as in [8]; the correlation condition is very slightly different and can be proved in the same manner. We present a sketch of the proof of the correlation condition here. Let $B$ be a box of size $R^{10M}$, where $M = m_0 + \dots + m_{l_2}$ and $R = N^{d^{-1} 2^{-d-5}}$.
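It suffices to prove the following (see Theorem 9.6 in [8]):
\[ \mathbb{E}_{x \in B} \prod_{k=1}^{m_0} \Lambda_R(W \phi_k(x) + b)^2 \prod_{k=1}^{l_2} \prod_{j=1}^{m_k} \Lambda_R\big( W \cdot (a_j^{(k)} \psi^{(k)}(x) + h_j^{(k)}) + b \big)^2 \le C_M \Big( \frac{W \log R}{\phi(W)} \Big)^M \prod_{k=1}^{l_2} \prod_{p | \Delta_k} \big( 1 + O_M(p^{-\frac{1}{2}}) \big), \tag{A.1} \]
where for each $k \ge 1$
\[ \Delta_k := \prod_{1 \le j < j' \le m_k} \Big( W \cdot \big(a_{j'}^{(k)} h_j^{(k)} - a_j^{(k)} h_{j'}^{(k)}\big) + \big(a_{j'}^{(k)} - a_j^{(k)}\big) b \Big). \]
Define $M_j := m_0 + m_1 + \dots + m_j$. Write $[M] = \bigcup_{j=1}^{m_0} I_j \cup \bigcup_{j=1}^{k} I_{m_j}$, where $I_j := \{j\}$ for $j \le m_0$ and $I_{m_j} = (M_{j-1}, M_j]$. Define
\[ \psi_i^{(k)} := \begin{cases} \phi_i & \text{if } i \in I_j,\ j \le m_0 \\ \psi^{(k)} & \text{if } i \in I_k,\ k > m_0 . \end{cases} \]
Now for each $i \in I_k$, $k \ge m_0$, write
\[ \theta_i(x) := W \cdot \big( a^{(k)}_{i - M_{k-1}} \psi^{(k)}_{i - M_{k-1}} + h^{(k)}_{i - M_{k-1}} \big) + b, \]
and for each $i \in I_j$, $j \le m_0$, define
\[ \theta_i(x) := W \phi_i(x) + b \quad (\text{i.e. } a_i^{(i)} = 1,\ h_i^{(i)} = 0,\ \psi^{(i)} = \phi_i). \]
For each $X \subseteq [M]$, define
\[ \omega_X(p) = \mathbb{E}_{x \in \mathbb{Z}_N^d} \prod_{i \in X} 1_{\theta_i(x) \equiv 0 \ (\mathrm{mod}\ p)} . \]
Write $z = (z_1, \dots, z_M)$, $z' = (z'_1, \dots, z'_M)$; then as in [8] we can write the left-hand side of (A.1) as the following integral plus a small error term:
\[ (2\pi i)^{-M} \underbrace{\int_{\Gamma_1} \cdots \int_{\Gamma_1}}_{2M} F(z, z') \prod_{1 \le j \le M} \frac{R^{z_j + z'_j}}{z_j^2 z_j'^2}\, dz_j\, dz'_j, \tag{A.2} \]
where $\Gamma_1$ is the line $\Re(z) = \sigma > 0$ and $F(z, z') := \prod_p E_p(z, z')$, where
\[ E_p(z, z') := \sum_{X, X' \subseteq [M]} \frac{(-1)^{|X| + |X'|}\, \omega_{X \cup X'}(p)}{p^{\sum_{j \in X} z_j + \sum_{j \in X'} z'_j}} . \]
To ensure the convergence we have the following estimate.

Lemma A.2 (Local factor estimate). Let $X \subseteq [M]$.
(1) If $X = \emptyset$ then $\omega_X(p) = 1$.
(2) If $p \le \omega(N)$, $X \ne \emptyset$ then $\omega_X(p) = 0$.
(3) If $p > \omega(N)$ and $X \subseteq I_k$ is nonempty then
\[ \omega_X(p) \begin{cases} = p^{-1} & \text{if } |X| = 1 \\ \le p^{-1} & \text{if } |X| > 1 . \end{cases} \]
Furthermore, if $|X| > 1$ and $p \nmid \Delta_k$ then $\omega_X(p) = 0$.
(4) If $p > \omega(N)$ and there are $k_1 \ne k_2$ such that both $X \cap I_{k_1}$, $X \cap I_{k_2}$ are nonempty then $\omega_X(p) \le p^{-2}$.

Proof of the Lemma.
(1) The first statement is trivial.
(2) If $p \le \omega(N)$, $j \in X$ then $W \cdot (a_j^{(k)} \psi_j^{(k)} + h_j^{(k)}) + b \equiv b \pmod p$, which gives the result since $(b, p) = 1$.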
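(3) Suppose $p > \omega(N)$. Firstly, if $X \subseteq I_k$, $X = \{i\}$, then
\[ \mathbb{E}_{x \in \mathbb{Z}_N^d}\, 1_{W \cdot (a^{(k)}_{i - M_{k-1}} \psi^{(k)}_{i - M_{k-1}}(x) + h^{(k)}_{i - M_{k-1}}) + b \equiv 0 \ (\mathrm{mod}\ p)} = p^{-1} . \]
Now if $|X| > 1$ with $j, j' \in X$, then using the previous estimate we have
\[ \omega_X(p) \le \mathbb{E}_{x \in \mathbb{Z}_N^d}\, 1_{W \cdot (a^{(k)}_{j - M_{k-1}} \psi^{(k)}_{j - M_{k-1}}(x) + h^{(k)}_{j - M_{k-1}}) + b \equiv 0 \ (\mathrm{mod}\ p)} = p^{-1} . \]
Now assume $p \nmid \Delta_k$. We use the estimate
\[ \omega_X(p) \le \mathbb{E}_{x \in \mathbb{Z}_N^d}\, 1_{W \cdot (a^{(k)}_{j - M_{k-1}} \psi^{(k)}_{j - M_{k-1}}(x) + h^{(k)}_{j - M_{k-1}}) + b \equiv 0 \ (\mathrm{mod}\ p)} \cdot 1_{W \cdot (a^{(k)}_{j' - M_{k-1}} \psi^{(k)}_{j' - M_{k-1}}(x) + h^{(k)}_{j' - M_{k-1}}) + b \equiv 0 \ (\mathrm{mod}\ p)} . \]
If $p \,|\, W \cdot (a_j^{(k)} \psi_j^{(k)}(x) + h_j^{(k)}) + b$ and $p \,|\, W \cdot (a_{j'}^{(k)} \psi_{j'}^{(k)}(x) + h_{j'}^{(k)}) + b$, then
\[ p \,\big|\, W \cdot \big(a_{j'}^{(k)} h_j^{(k)} - a_j^{(k)} h_{j'}^{(k)}\big) + \big(a_{j'}^{(k)} - a_j^{(k)}\big) b, \]
so $p \,|\, \Delta_k$. Hence if $p \nmid \Delta_k$ then $\omega_X(p) = 0$.
(4) Assume $j \in X \cap I_{k_1}$, $j' \in X \cap I_{k_2}$; then
\[ \omega_X(p) \le \mathbb{E}_{x \in \mathbb{Z}_N^d}\, 1_{W \cdot (a^{(k_1)}_{j - M_{k_1 - 1}} \psi^{(k_1)}_{j - M_{k_1 - 1}}(x) + h^{(k_1)}_{j - M_{k_1 - 1}}) + b \equiv 0 \ (\mathrm{mod}\ p)} \cdot 1_{W \cdot (a^{(k_2)}_{j' - M_{k_2 - 1}} \psi^{(k_2)}_{j' - M_{k_2 - 1}}(x) + h^{(k_2)}_{j' - M_{k_2 - 1}}) + b \equiv 0 \ (\mathrm{mod}\ p)} . \]
For $i = 1, 2$, write $a^{(k_i)}_{j - M_{k_i - 1}} \psi^{(k_i)}_{j - M_{k_i - 1}}(x) = \sum_{s=1}^{t} L_{k_i, s} x_s$, and our condition becomes
\[ \sum_{s=1}^{t} L_{k_1, s} x_s \equiv -W^{-1} b - h_{j_1}^{(k_1)} \pmod p, \]
\[ \sum_{s=1}^{t} L_{k_2, s} x_s \equiv -W^{-1} b - h_{j_2}^{(k_2)} \pmod p . \]
By assumption, $(L_{k_1, s})_{1 \le s \le t}$ and $(L_{k_2, s})_{1 \le s \le t}$ are linearly independent. Now suppose $L_{j, t_1} \equiv \lambda L_{j, t_2} \pmod p$ and $L_{j', t_1} \equiv \lambda L_{j', t_2} \pmod p$ for some $\lambda \in \mathbb{Z}$; then
\[ |L_{j t_1} L_{j' t_2} - L_{j t_2} L_{j' t_1}| = \left| \det \begin{pmatrix} L_{j t_1} & L_{j' t_1} \\ L_{j t_2} & L_{j' t_2} \end{pmatrix} \right| \equiv 0 \pmod p . \]
Hence if $N > N_k$ is sufficiently large then
\[ p > \omega(N) > \left| \det \begin{pmatrix} L_{j t_1} & L_{j' t_1} \\ L_{j t_2} & L_{j' t_2} \end{pmatrix} \right|, \quad\text{so}\quad p \nmid \det \begin{pmatrix} L_{j t_1} & L_{j' t_1} \\ L_{j t_2} & L_{j' t_2} \end{pmatrix} . \]
This implies $\omega_X(p) \le p^{-2}$.

Remark A.3. If we choose $W$ according to our last argument, then, since the number of linear forms we consider is finite, we can choose $W$ to be a fixed finite constant, independent of $N$.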
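The four cases of Lemma A.2 can be checked by brute force on a toy example. The sketch below makes the simplifying assumption that the average defining $\omega_X(p)$ is taken over $x \in (\mathbb{Z}/p\mathbb{Z})^t$, and it uses small hypothetical parameters ($W = 6$, $b = 1$, $t = 2$, $p \in \{3, 7\}$) that are not taken from the paper. The helper names `theta` and `omega` are likewise illustrative.

```python
from fractions import Fraction
from itertools import product


def theta(W, b, a, psi, h):
    """Affine form W*(a*psi(x) + h) + b, returned as (coefficients, constant),
    where psi is the coefficient tuple of a homogeneous linear form."""
    return [W * a * c for c in psi], W * h + b


def omega(forms, p, t):
    """Exact proportion of x in (Z/pZ)^t at which every form vanishes mod p."""
    hits = 0
    for x in product(range(p), repeat=t):
        if all((sum(c * xs for c, xs in zip(coeffs, x)) + const) % p == 0
               for coeffs, const in forms):
            hits += 1
    return Fraction(hits, p ** t)


W, b, t = 6, 1, 2            # W = 2 * 3, so the role of omega(N) is played by 3
psi1, psi2 = (1, 2), (1, 3)  # two linearly independent forms

# Case (1): the empty set of forms gives 1.
case1 = omega([], 7, t)
# Case (2): p = 3 divides W, so every form is congruent to b = 1 mod 3.
case2 = omega([theta(W, b, 1, psi1, 0)], 3, t)
# Case (3), |X| = 1: a single nonzero form vanishes with density 1/p.
case3_single = omega([theta(W, b, 1, psi1, 0)], 7, t)
# Case (3), |X| = 2, same psi: W*(a'h - a h') + (a' - a)*b = -5, not divisible by 7.
case3_zero = omega([theta(W, b, 1, psi1, 0), theta(W, b, 2, psi1, 1)], 7, t)
# Same psi with h' = 6: W*(a'h - a h') + (a' - a)*b = -35, divisible by 7.
case3_bad = omega([theta(W, b, 1, psi1, 0), theta(W, b, 2, psi1, 6)], 7, t)
# Case (4): forms built from independent psi1, psi2 vanish jointly with density 1/p^2.
case4 = omega([theta(W, b, 1, psi1, 0), theta(W, b, 1, psi2, 0)], 7, t)
```

In the second instance of case (3) the prime $7$ divides the corresponding factor of $\Delta_k$, and the two conditions collapse to the single condition $\psi(x) \equiv 1 \pmod 7$, which is why the density is $p^{-1}$ rather than $0$.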
The rest of the argument is very similar to that in [8]. After expanding $E_p(z, z')$, we have
\[ E_p(z, z') = \sum_{X, X' \subseteq [M]} \frac{(-1)^{|X| + |X'|}\, \omega_{X \cup X'}(p)}{p^{\sum_{j \in X} z_j + \sum_{j \in X'} z'_j}} \]
\[ = 1 - 1_{p > \omega(N)} \sum_{j=1}^{M} \big( p^{-1 - z_j} + p^{-1 - z'_j} - p^{-1 - z_j - z'_j} \big) + \sum_{i=0}^{k} 1_{p > \omega(N);\ p | \Delta_i}\, \lambda_p^{(i)}(z, z') + \sum_{\substack{X \cup X' \not\subseteq I_\alpha \\ |X \cup X'| > 1}} \frac{O_M(p^{-2})}{p^{\sum_{j \in X} z_j + \sum_{j \in X'} z'_j}}, \]
where
\[ \lambda_p^{(i)}(z, z') = \sum_{\substack{X \cup X' \subseteq I_i \\ |X \cup X'| > 1}} \frac{O_M(p^{-1})}{p^{\sum_{j \in X} z_j + \sum_{j \in X'} z'_j}} . \]
Define
\[ E_p^{(0)}(z, z') = 1 + \sum_{i=0}^{k} 1_{p > \omega(N);\ p | \Delta_i}\, \lambda_p^{(i)}(z, z'), \]
then write $E_p = E_p^{(0)} E_p^{(1)} E_p^{(2)} E_p^{(3)}$, where
\[ E_p^{(1)}(z, z') := \frac{E_p(z, z')}{E_p^{(0)}(z, z') \prod_{j=1}^{M} (1 - 1_{p > \omega(N)} p^{-1 - z_j})(1 - 1_{p > \omega(N)} p^{-1 - z'_j})(1 - 1_{p > \omega(N)} p^{-1 - z_j - z'_j})^{-1}}, \]
\[ E_p^{(2)}(z, z') := \prod_{j=1}^{M} (1 - 1_{p \le \omega(N)} p^{-1 - z_j})^{-1} (1 - 1_{p \le \omega(N)} p^{-1 - z'_j})^{-1} (1 - 1_{p \le \omega(N)} p^{-1 - z_j - z'_j}), \]
\[ E_p^{(3)}(z, z') := \prod_{j=1}^{M} (1 - p^{-1 - z_j})(1 - p^{-1 - z'_j})(1 - p^{-1 - z_j - z'_j})^{-1} . \]
Let $G_i = \prod_p E_p^{(i)}$, noting that $G_3 = \prod_{j=1}^{M} \frac{\zeta(1 + z_j + z'_j)}{\zeta(1 + z_j)\, \zeta(1 + z'_j)}$.

For $\sigma > 0$, define
\[ D_\sigma^M = \{ (z_j, z'_j) : \Re(z_j), \Re(z'_j) \in (-\sigma, 100),\ 1 \le j \le M \} . \]
Now suppose $f$ is analytic on $D_\sigma^M$; define the norm
\[ \|f\|_{C^k(D_\sigma^M)} := \sup_{\substack{(\alpha_1, \dots, \alpha_M),\ (\alpha'_1, \dots, \alpha'_M) \\ \sum \alpha_i + \sum \alpha'_i \le k}} \Big\| \Big( \frac{\partial}{\partial z_1} \Big)^{\alpha_1} \cdots \Big( \frac{\partial}{\partial z_M} \Big)^{\alpha_M} \Big( \frac{\partial}{\partial z'_1} \Big)^{\alpha'_1} \cdots \Big( \frac{\partial}{\partial z'_M} \Big)^{\alpha'_M} f \Big\|_{L^\infty(D_\sigma^M)} . \]

Lemma A.3. [8] Let $0 < \sigma < \frac{1}{6M}$. Then for $i = 0, 1, 2$, $G_i$ is absolutely convergent in $D_\sigma^M$ and hence represents an analytic function on this domain, and we have
\[ \|G_0\|_{C^r(D_\sigma^M)} = O_M\Big( \Big( \frac{\log R}{\log\log R} \Big)^r \Big) \prod_{p | \prod_{i=0}^{k} \Delta_i} \big( 1 + O_M(p^{2M\sigma - 1}) \big), \quad 0 \le r \le M, \]
\[ \|G_0\|_{C^M(D_\sigma^M)} \le \exp\big( O_M(\log^{\frac{1}{3}} R) \big), \]
\[ \|G_1\|_{C^M(D_\sigma^M)} = O_M(1), \]
\[ \|G_2\|_{C^M(D_\sigma^M)} \le O_{M,W}(1), \]
\[ G_0(0, 0) = \prod_{i=0}^{k} \prod_{p | \Delta_i} \big( 1 + O_M(p^{-\frac{1}{2}}) \big), \]
\[ G_1(0, 0) = 1 + O_M(1), \]
\[ G_2(0, 0) = \Big( \frac{W}{\phi(W)} \Big)^M . \]
For the proof of this lemma see Lemma 10.3 and Lemma 10.6 in [8] with $\Delta = \prod_{i=0}^{k} \Delta_i$, noting that
\[ G_0(0, 0) = \prod_{p | \Delta} E_p^{(0)} = \prod_{p | \Delta} \Big( 1 + \sum_{i=0}^{k} \lambda_p^{(i)}(0, 0) \Big) \le \prod_{i=0}^{k} \prod_{p | \Delta_i} \big( 1 + |\lambda_p^{(i)}(0, 0)| \big), \]
and we crudely have $1 + |\lambda_p^{(i)}(0, 0)| = 1 + O_M(p^{-\frac{1}{2}})$.

Now the contour integral takes the form
\[ (2\pi i)^{-M} \int_{\Gamma_1} \cdots \int_{\Gamma_1} G(z, z') \prod_{j=1}^{M} \frac{\zeta(1 + z_j + z'_j)\, R^{z_j + z'_j}}{\zeta(1 + z_j)\, \zeta(1 + z'_j)\, z_j^2 z_j'^2}\, dz_j\, dz'_j \]
with $G = G_0 G_1 G_2$. Now apply the following lemma (Lemma 10.4 in [8], see also [12], [13]) to prove the estimate (A.1).

Lemma A.4 (Goldston-Yildirim [8], [12], [13]). Let $R > 0$, and suppose $G(z, z')$ is analytic in $D_\sigma^M$ for some $\sigma > 0$ and $\|G\|_{C^k(D_\sigma^M)} = \exp\big( O_{M,\sigma}(\log^{\frac{1}{3}} R) \big)$. Then
\[ (2\pi i)^{-M} \int_{\Gamma_1} \cdots \int_{\Gamma_1} G(z, z') \prod_{j=1}^{M} \frac{\zeta(1 + z_j + z'_j)\, R^{z_j + z'_j}}{\zeta(1 + z_j)\, \zeta(1 + z'_j)\, z_j^2 z_j'^2}\, dz_j\, dz'_j = G(0, \dots, 0) \log^M R + \sum_{j=1}^{M} O_{M,\sigma}\big( \|G\|_{C^j(D_\sigma^M)} \big) \log^{M-j} R + O_{M,\sigma}\big( \exp(-\delta \sqrt{R}) \big) \]
for some $\delta > 0$.

REFERENCES

[1] D. Conlon, J. Fox, Y. Zhao, A relative Szemerédi theorem, Geometric and Functional Analysis, to appear.
[2] B. Cook, A. Magyar, T. Titichetrakun, A multidimensional Szemerédi's theorem in the primes, Preprint.
[3] B. Cook, A. Magyar, Constellations in $\mathbb{P}^d$, International Mathematics Research Notices 2012.12 (2012), 2794-2816.
[4] H. Furstenberg, Y. Katznelson, An ergodic Szemerédi theorem for commuting transformations, J. Analyse Math. 31 (1978), 275-291.
[5] W.T. Gowers, Hypergraph regularity and the multidimensional Szemerédi theorem, Annals of Math. 166/3 (2007), 897-946.
[6] W.T. Gowers, Decompositions, approximate structure, transference, and the Hahn-Banach theorem, Bull. London Math. Soc. 42 (4) (2010), 573-606.
[7] W.T. Gowers, Lower bounds of tower type for Szemerédi's uniformity lemma, Geom. Funct. Anal. 7 (1997), 322-337.
[8] B. Green and T. Tao, The primes contain arbitrarily long arithmetic progressions, Annals of Math. 167 (2008), 481-547.
[9] B. Green and T. Tao, Linear equations in primes, Annals of Math. (2) 171.3 (2010), 1753-1850.
[10] B. Green and T. Tao, The Möbius function is strongly orthogonal to nilsequences, Annals of Math. (2) 175 (2012), 541-566.
[11] B. Green, T. Tao and T. Ziegler, An inverse theorem for the Gowers $U^{s+1}[N]$ norm, Annals of Math. 176 (2012), no. 2, 1231-1372.
[12] D. Goldston and C. Yildirim, Higher correlations of divisor sums related to primes I: triple correlations, Integers: Electronic Journal of Combinatorial Number Theory 3 (2003), 1-66.
[13] D. Goldston and C. Yildirim, Higher correlations of divisor sums related to primes III: small gaps between primes, Proc. London Math. Soc. 95 (2007), 653-686.
[14] J. Fox and Y. Zhao, A short proof of the multidimensional Szemerédi theorem in the primes, American Journal of Mathematics, to appear.
[15] B. Nagle, V. Rödl, M. Schacht, The counting lemma for regular k-uniform hypergraphs, Random Structures and Algorithms 28(2) (2006), 113-179.
[16] O. Reingold, L. Trevisan, M. Tulsiani, S. Vadhan, Dense subsets of pseudorandom sets, Electronic Colloquium on Computational Complexity, Report TR08-045 (2008).
[17] I.Z. Ruzsa and E. Szemerédi, Triple systems with no six points carrying three triangles, Colloq. Math. Soc. János Bolyai 18 (1978), 939-945.
[18] I.D. Shkredov, On a problem of Gowers, Izv. Ross. Akad. Nauk Ser. Mat. 70 (2006), no. 2, 179-221.
[19] J. Solymosi, Note on a generalization of Roth's theorem, Discrete and Computational Geometry, Algorithms Combin. 25 (2003), 825-827.
[20] E. Szemerédi, On sets of integers containing no k elements in arithmetic progression, Acta Arith. 27 (1975), 299-345.
[21] T. Tao, The ergodic and combinatorial approaches to Szemerédi's theorem, Centre de Recherches Mathématiques, CRM Proceedings and Lecture Notes 43 (2007), 145-193.
[22] T. Tao, The Gaussian primes contain arbitrarily shaped constellations, J. Analyse Math. 99/1 (2006), 109-176.
[23] T. Tao, The prime tuples conjecture, sieve theory, and the work of Goldston-Pintz-Yildirim, Motohashi-Pintz, and Zhang, (2013).
[24] T. Tao, A variant of the hypergraph removal lemma, Journal of Combinatorial Theory, Series A 113.7 (2006), 1257-1280.
[25] T. Tao and T. Ziegler, The primes contain arbitrarily long polynomial progressions, Acta Math. 201 (2008), 213-305.

E-mail address: tatchai@math.ubc.ca