Fourier based statistics for irregularly spaced spatial data

Suhasini Subba Rao
Department of Statistics, Texas A&M University, College Station, U.S.A.
suhasini@stat.tamu.edu

December 8, 2015

Abstract

A class of Fourier based statistics for irregularly spaced spatial data is introduced; examples include the Whittle likelihood, a parametric estimator of the covariance function based on the $L_2$-contrast function, and a simple nonparametric estimator of the spatial autocovariance which is a non-negative definite function. The Fourier based statistic is a quadratic form of a discrete Fourier-type transform of the spatial data. Evaluation of the statistic is computationally tractable, requiring $O(nb)$ operations, where $b$ is the number of Fourier frequencies used in the definition of the statistic (which varies according to the application) and $n$ is the sample size. The asymptotic sampling properties of the statistic are derived using mixed spatial asymptotics, where the number of locations grows at a faster rate than the size of the spatial domain, under the assumption that the spatial random field is stationary and the irregular locations are independent, identically distributed random variables. The asymptotic analysis allows for the case that the frequency grid over which the estimator is defined is unbounded. These results are used to obtain the asymptotic sampling properties of certain estimators and also in variance estimation.

Keywords and phrases: Fixed and increasing frequency domain asymptotics, mixed spatial asymptotics, random locations, spectral density function, stationary spatial random fields.

1 Introduction

In recent years irregularly spaced spatial data has become ubiquitous in disciplines as varied as the geosciences and econometrics. The analysis of such data poses several challenges which do not arise with data sampled on a regular lattice. One important problem is the computational cost of dealing with large, irregularly sampled data sets. If spatial data are sampled on a regular lattice, then algorithms such as the fast Fourier transform can be employed to reduce the computational burden (see, for example, Chen, Hurvich, and Lu (2006)). Unfortunately, such algorithms have little benefit if the spatial data are irregularly sampled. To address this issue, within the spatial domain, several authors, including Vecchia (1988), Cressie and Huang (1999) and Stein, Chi, and Welty (2004), have proposed estimation methods which are designed to reduce the computational burden.

In contrast to the above references, Fuentes (2007) argues that working within the frequency domain can simplify the problem. Fuentes assumes that the irregularly spaced data can be embedded on a grid and that the missing mechanism is deterministic and 'locally smooth'. Based on these assumptions, Fuentes proposes a tapered Whittle estimator to estimate the parameters of a spatial covariance function. However, a possible drawback is that, if the locations are extremely irregular, the local smoothness assumption will not hold. Therefore, in order to work within the frequency domain, methodology and inference devised specifically for irregularly sampled data are required. Matsuda and Yajima (2009) and Bandyopadhyay and Lahiri (2009) have pioneered this approach by assuming that the irregular locations are independent, identically distributed random variables (thus allowing the data to be extremely irregular) and defining the irregularly sampled discrete Fourier transform (DFT) as
$$J_n(\omega) = \frac{\lambda^{d/2}}{n}\sum_{j=1}^{n} Z(s_j)\exp(is_j'\omega), \qquad (1)$$

where $s_j\in[-\lambda/2,\lambda/2]^d$ denotes the spatial locations observed in the region $[-\lambda/2,\lambda/2]^d$ and $\{Z(s_j)\}$ denotes the spatial random field at these locations. It is worth mentioning that a similar transformation of irregularly sampled data goes back to Masry (1978), who defines the discrete Fourier transform of Poisson-sampled continuous-time series. Using the definition of the DFT given in (1), Matsuda and Yajima (2009) define the Whittle likelihood for spatial data as

$$\int_{\Omega}\left(\log f_\theta(\omega)+\frac{|J_n(\omega)|^2}{f_\theta(\omega)}\right)d\omega, \qquad (2)$$

where $\Omega$ is a compact set in $\mathbb{R}^d$ (here we assume $\Omega = 2\pi[-C,C]^d$) and $f_\theta(\omega)$ is the candidate parametric spectral density function. A clear advantage of this approach is that it avoids the inversion of a large matrix. However, one still needs to evaluate the integral, which can be computationally difficult, especially if the aim is to minimise over the entire parameter space of $\theta$. For frequency domain methods to be computationally tractable and a viable alternative to 'spatial domain' methods, the frequency domain needs to be gridified in such a way that tangible estimators are defined on the grid. The problem is how to choose an appropriate lattice on $\mathbb{R}^d$. A possible solution can be found in Bandyopadhyay and Subba Rao (2015), Theorem 2.1, who show that on the lattice $\{\omega_k = (\frac{2\pi k_1}{\lambda},\ldots,\frac{2\pi k_d}{\lambda});\ k=(k_1,\ldots,k_d)\in\mathbb{Z}^d\}$ the DFT $\{J_n(\omega_k)\}$ is 'close to uncorrelated' if the random field is second order stationary and the locations $\{s_j\}$ are independent, identically, uniformly distributed random variables. Heuristically, this result suggests that $\{J_n(\omega_k)\}$ contains all the information about the random field, so that one can perform the analysis on the transformed data $\{J_n(\omega_k)\}$. For example, rather than use (2), one can use the discretized likelihood

$$\frac{1}{\lambda^d}\sum_{k_1,\ldots,k_d=-C\lambda}^{C\lambda}\left(\log f_\theta(\omega_k)+\frac{|J_n(\omega_k)|^2}{f_\theta(\omega_k)}\right) \qquad (3)$$

to estimate the parameters. Indeed, Matsuda and Yajima (2009), Remark 2, mention that in practice one should use the discretized likelihood to estimate the parameters but, unfortunately, they do not derive any results for it. Integrated periodogram statistics, such as the discretized Whittle likelihood defined above, are widely used in time series and their properties are well understood. However, with the exception of Bandyopadhyay, Lahiri, and Norman (2015), as far as we are aware, there do not exist any results for integrated periodograms of irregularly sampled spatial data. Therefore one of the objectives of this paper is to study the class of integrated statistics of the form

$$Q_{a,\lambda}(g_\theta;r) = \frac{1}{\lambda^d}\sum_{k_1,\ldots,k_d=-a}^{a} g_\theta(\omega_k)J_n(\omega_k)\overline{J_n(\omega_{k+r})}, \qquad r\in\mathbb{Z}^d, \qquad (4)$$

where $\theta$ is a finite dimensional parameter, $J_n(\omega_k)$ is defined in (1) and $g_\theta:\mathbb{R}^d\to\mathbb{R}$ is a weight function which depends on the application. Note that for $r=0$, $a=C\lambda$ and $g_\theta(\omega)=f_\theta(\omega)^{-1}$ we obtain the discretized Whittle likelihood (up to a non-random constant). We have chosen to consider the case $r\neq0$ because, when the locations are uniformly sampled, $Q_{a,\lambda}(g_\theta;0)$ and $\{Q_{a,\lambda}(g_\theta;r);\ r\neq0\}$ asymptotically have the same variance, which we exploit to estimate the variance of $Q_{a,\lambda}(g_\theta;0)$ in Section 5. In terms of computation, evaluating $\{J_n(\omega_k);\ k=(k_1,\ldots,k_d),\ k_j=-a,\ldots,a\}$ requires $O(a^d n)$ operations (as far as we are aware, the FFT cannot be used to reduce the number of operations for irregularly spaced data). However, once $\{J_n(\omega_k)\}$ has been evaluated, the evaluation of $Q_{a,\lambda}(g_\theta;r)$ requires only $O(a^d)$ operations.
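To make the computation concrete, the following sketch (ours, not part of the original analysis; $d=1$ and all data choices are illustrative) evaluates the DFT (1) on the frequency grid and then forms the quadratic form (4) in $O(a)$ operations once the DFT has been stored.

```python
import numpy as np

rng = np.random.default_rng(0)

def J_n(Z, s, lam, kmin, kmax):
    """Irregularly sampled DFT (1), d = 1: J_n(omega_k) for k = kmin..kmax,
    with omega_k = 2*pi*k/lam."""
    n = len(Z)
    omega = 2.0 * np.pi * np.arange(kmin, kmax + 1) / lam
    return (lam ** 0.5 / n) * np.exp(1j * np.outer(omega, s)) @ Z

def Q(Z, s, lam, a, g, r=0):
    """Quadratic form Q_{a,lam}(g; r) of (4), d = 1:
    (1/lam) * sum_k g(omega_k) J_n(omega_k) conj(J_n(omega_{k+r}))."""
    J = J_n(Z, s, lam, -a, a + r) if r >= 0 else J_n(Z, s, lam, -a + r, a)
    m = 2 * a + 1                              # frequencies k = -a..a
    omega = 2.0 * np.pi * np.arange(-a, a + 1) / lam
    if r >= 0:
        Jk, Jkr = J[:m], J[r:r + m]            # J at k and at k + r
    else:
        Jk, Jkr = J[-m:], J[:m]
    return np.sum(g(omega) * Jk * np.conj(Jkr)) / lam

# toy illustration: nugget-type field at n uniform locations on [-lam/2, lam/2]
lam, n, a = 20.0, 500, 40
s = rng.uniform(-lam / 2, lam / 2, n)
Z = rng.normal(size=n)
print(Q(Z, s, lam, a, g=lambda w: np.ones_like(w), r=0).real)
```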
To understand why such a class of statistics may be of interest in spatial statistics, consider the case $r=0$ and replace $|J_n(\omega_k)|^2$ in $Q_{a,\lambda}(g_\theta;0)$ with the spectral density $f:\mathbb{R}^d\to\mathbb{R}$ of the spatial process $\{Z(s);\ s\in\mathbb{R}^d\}$. This replacement suggests that $Q_{a,\lambda}(g_\theta;0)$ is estimating the functional $I(g_\theta;\frac a\lambda)$, where

$$I\Big(g_\theta;\frac a\lambda\Big) = \frac{1}{(2\pi)^d}\int_{2\pi[-a/\lambda,a/\lambda]^d}g_\theta(\omega)f(\omega)\,d\omega. \qquad (5)$$

In Section 3 we show that this conjecture is true, up to a multiplicative constant. Before giving examples of functionals $I(g_\theta;\frac a\lambda)$ of interest, we first compare $Q_{a,\lambda}(g_\theta;0)$ with the integrated periodogram estimators often used in time series (see, for example, Walker (1964), Hannan (1971), Dunsmuir (1979), Dahlhaus and Janas (1996), Can, Mikosch, and Samorodnitsky (2010) and Niebuhr and Kreiss (2014)). There are some fundamental differences, which mean the analysis of (4) is very different from that in classical time series. We observe that the user-chosen frequency grid $\{\omega_k;\ k=(k_1,\ldots,k_d),\ k_j\in[-a,a]\}$ used to define $Q_{a,\lambda}(g_\theta;r)$ influences the limits of the functional $I(g;\frac a\lambda)$. This differs from regularly sampled locations, where the frequency grid is defined over $[0,2\pi]^d$ (outside this domain aliasing occurs). Returning to irregularly sampled spatial data, in certain situations, such as the Whittle likelihood described in (3) or the spectral density estimator (described in Section 2), bounding the frequency grid to $a=C\lambda$ is necessary. In the case of the Whittle likelihood this is important because $|f_\theta(\omega)|\to0$ as $\|\omega\|\to\infty$ (for any norm $\|\cdot\|$); thus the discretized Whittle likelihood is only well defined over a bounded frequency grid. The choice of $C$ is tied to how fast the tails of the parametric class of spectral density functions $\{f_\theta;\ \theta\in\Theta\}$ (where $\Theta$ is a compact parameter space) decay to zero.

In contrast, in most situations the aim is to estimate the functional $I(g_\theta;\infty)$. For example, the spatial covariance can be written as $c(v) = I(e^{iv\cdot};\infty)$, or a loss function may be defined over all frequencies. In this case, the only plausible method for estimating $I(g_\theta;\infty)$ without asymptotic bias is to allow the frequency grid $\{\omega_k;\ k=(k_1,\ldots,k_d),\ -a\le k_i\le a\}$ to be unbounded, in other words to let $a/\lambda\to\infty$ as $\lambda\to\infty$, in which case $I(g_\theta;\frac a\lambda)\to I(g;\infty)$ as the domain $\lambda\to\infty$. To understand whether this is feasible, recall that if the spatial process were observed only on a grid, aliasing would make it impossible to estimate $I(g_\theta;\infty)$. On the other hand, unlike regularly spaced or near regularly spaced data, 'truly' irregular sampling means that the DFT can estimate high frequencies without the curse of aliasing (a phenomenon noticed as early as Shapiro and Silverman (1960) and Beutler (1970)). In this case, in the definition of $Q_{a,\lambda}(g_\theta;r)$ there is no need for the frequency grid to be bounded, and $a$ can be magnitudes larger than $\lambda$. One contribution of this paper is to understand how an unbounded frequency grid influences inference.

As alluded to in the previous paragraph, in this paper we derive the asymptotic sampling properties of $Q_{a,\lambda}(g_\theta;r)$ under the asymptotic set-up that the spatial domain $\lambda\to\infty$. The main focus will be the mixed asymptotic framework, introduced in Hall and Patil (1994) (see also Hall, Fisher, and Hoffman (1994)) and used in, for example, Lahiri (2003), Matsuda and Yajima (2009), Bandyopadhyay and Lahiri (2009), Bandyopadhyay et al. (2015) and Bandyopadhyay and Subba Rao (2015).
This is where $\lambda\to\infty$ (the size of the spatial domain grows) and the number of observations $n\to\infty$ in such a way that $\lambda^d/n\to0$. In other words, the sampling gets denser as the spatial domain grows. The results differ slightly under the pure increasing domain framework (see Bandyopadhyay et al. (2015), who compare the mixed and pure increasing domain frameworks), where $\lambda^d/n\to c$ ($0<c<\infty$) as $n\to\infty$ and $\lambda\to\infty$. In Section 3.3 we outline some of these differences. We should mention that under the fixed-domain asymptotic framework (where $\lambda$ is kept fixed but the number of locations $n$ grows) considered in Stein (1994, 1999), $Q_{a,\lambda}(g_\theta;r)$ cannot consistently estimate $I(g_\theta;\frac a\lambda)$: there is a non-trivial bias (which can be calculated using the results in Section 3.1) and a variance that does not converge to zero as $n\to\infty$.

Obtaining the moments of $Q_{a,\lambda}(g_\theta;r)$ in the case $a=O(\lambda)$ is analogous to the regularly sampled time series and spatial cases, and follows from the DFT analysis given in Section 2.1. However, when the frequency grid is unbounded, the number of terms within the summand of $Q_{a,\lambda}(g_\theta;r)$ is $O(a^d)$, which grows at a faster rate than the standardisation $\lambda^d$. This turns the problem into one that is technically rather challenging: bounds on the cumulants of DFTs cannot be applied to estimators defined on an increasing frequency grid. One of the aims of this paper is to develop the machinery for the analysis of estimators defined with an unbounded frequency grid. We show that under Gaussianity of the spatial process and some mild conditions on the spectral density function, the choice of $a$ does not play a significant role in the sampling properties of $Q_{a,\lambda}(g_\theta;r)$. In particular, we show that $Q_{a,\lambda}(g_\theta;r)$ is a mean squared consistent estimator of $I(g_\theta;\infty)$ (up to a finite constant) for any $a$, so long as $a/\lambda\to\infty$ as $\lambda\to\infty$. Furthermore, by constraining the rate of growth of the frequency grid to $a=O(\lambda^k)$ for some $k$ such that $1\le k<\infty$ (both the bounded and unbounded frequency grids come under this canopy), the asymptotic sampling properties (mean, variance and asymptotic normality) of $Q_{a,\lambda}(g_\theta;r)$ can be derived. On the other hand, we show that if the spatial random process is non-Gaussian, then the additional constraint $(\lambda a)^d/n^2\to c<\infty$ as $\lambda,a,n\to\infty$ is required. This bound is quite strong and highlights stark differences between the statistical properties of a Gaussian random field and those of a non-Gaussian random field. The results in this paper (at least in the Gaussian case) can be used to verify condition C.4(ii) in Bandyopadhyay et al. (2015).

We now summarize the paper. In Section 2 we obtain the properties of the DFT $J_n(\omega_k)$ and give examples of estimators which can be written as $Q_{a,\lambda}(g_\theta;r)$. In Section 3 we derive the asymptotic sampling properties of $Q_{a,\lambda}(g_\theta;r)$. In Section 4 we apply the results derived in Section 3 to some of the statistics proposed in Section 2. The expressions for the variance of $Q_{a,\lambda}(g_\theta;0)$ are complex and difficult to estimate in practice. In Section 5 we exploit the results in Section 3.4, where we show that, when the locations are uniformly distributed, $\{Q_{a,\lambda}(g_\theta;r)\}$ forms a 'near uncorrelated' sequence over $r$ which asymptotically has the same variance when $r/\lambda\to0$ as $\lambda\to\infty$. Using these results we define an 'orthogonal' sample which can be used to obtain a simple estimator of the variance of $Q_{a,\lambda}(g_\theta;0)$ that is computationally fast to evaluate and has useful theoretical properties.
An outline of the proofs in the case of uniform sampling can be found in Appendix A. The proofs for non-uniform sampling and the technical proofs can be found in the supplementary material, Subba Rao (2015b); many of these results build on the work of Kawata (1959) and may be of independent interest.

2 Preliminary results and assumptions

We observe the spatial random field $\{Z(s);\ s\in\mathbb{R}^d\}$ at the locations $\{s_j\}_{j=1}^n$. Throughout this paper we use the following assumptions on the spatial random field.

Assumption 2.1 (Spatial random field) (i) $\{Z(s);\ s\in\mathbb{R}^d\}$ is a second order stationary random field with mean zero and covariance function $c(s_1-s_2) = \mathrm{cov}(Z(s_1),Z(s_2)|s_1,s_2)$. We define the spectral density function as $f(\omega) = \int_{\mathbb{R}^d}c(s)\exp(-is'\omega)ds$.

(ii) $\{Z(s);\ s\in\mathbb{R}^d\}$ is a stationary Gaussian random field.

In the following sections we require the following definitions. For some finite $0<C<\infty$ and $\delta>0$, let

$$\beta_\delta(s) = \begin{cases} C & |s|\in[-1,1]\\ C|s|^{-(1+\delta)} & |s|>1\end{cases}. \qquad (6)$$

Let $\beta_\delta(s) = \prod_{j=1}^d\beta_\delta(s_j)$. To minimise notation we will often use $\sum_{k=-a}^{a}$ to denote the multiple sum $\sum_{k_1=-a}^{a}\cdots\sum_{k_d=-a}^{a}$. Let $\|\cdot\|_1$ denote the $\ell_1$-norm and $\|\cdot\|_2$ the $\ell_2$-norm of a vector.

2.1 Properties of the DFTs

We first consider the behaviour of the DFTs under both uniform and non-uniform sampling of the locations. Below we summarize Theorem 2.1 of Bandyopadhyay and Subba Rao (2015), which identifies frequencies at which the Fourier transform is 'close to uncorrelated'; this result requires the following additional assumption on the distribution of the spatial locations.

Assumption 2.2 (Uniform sampling) The locations $\{s_j\}$ are independent uniformly distributed random variables on $[-\lambda/2,\lambda/2]^d$.

Theorem 2.1 Suppose $\{Z(s);\ s\in\mathbb{R}^d\}$ is a stationary spatial random field whose covariance function (defined in Assumption 2.1(i)) satisfies $|c(s)|\le\beta_{1+\delta}(s)$ for some $\delta>0$. Furthermore, the locations $\{s_j\}$ satisfy Assumption 2.2. Then we have

$$\mathrm{cov}[J_n(\omega_{k_1}),J_n(\omega_{k_2})] = \begin{cases} f(\omega_k)+O\big(\frac1\lambda+\frac{\lambda^d}{n}\big) & k_1=k_2\ (=k)\\ O\big(\frac{1}{\lambda^{d-b}}\big) & k_1-k_2\neq0\end{cases}$$

where $b=b(k_1-k_2)$ denotes the number of zero elements in the vector $k_1-k_2$, $\omega_k = (\frac{2\pi k_1}{\lambda},\ldots,\frac{2\pi k_d}{\lambda})$, $k\in\mathbb{Z}^d$, and the bounds are uniform in $k,k_1,k_2\in\mathbb{Z}^d$.

PROOF See Theorem 2.1, Bandyopadhyay and Subba Rao (2015). □

To understand what happens in the case that the locations are not uniformly sampled, we adopt the assumptions of Hall and Patil (1994), Matsuda and Yajima (2009) and Bandyopadhyay and Lahiri (2009) and assume that $\{s_j\}$ are iid random variables with density $\frac{1}{\lambda^d}h(\frac\cdot\lambda)$, where $h:[-1/2,1/2]^d\to\mathbb{R}$. We use the following assumptions on the sampling density $h$.

Assumption 2.3 (Non-uniform sampling) The locations $\{s_j\}$ are independent distributed random variables on $[-\lambda/2,\lambda/2]^d$, where the density of $\{s_j\}$ is $\frac{1}{\lambda^d}h(\frac\cdot\lambda)$ and $h(\cdot)$ admits the Fourier representation

$$h(u) = \sum_{j\in\mathbb{Z}^d}\gamma_j\exp(i2\pi j'u),$$

where $\sum_{j\in\mathbb{Z}^d}\|\gamma_j\|_1<\infty$ and $|\gamma_j|\le C\prod_{i=1}^d|j_i|^{-(1+\delta)I(j_i\neq0)}$ (for some $\delta>0$). This assumption is satisfied if the second derivative of $h$ is bounded on the $d$-dimensional torus $[-1/2,1/2]^d$.

Note that if $h$ is such that $\sup_{s\in[-1/2,1/2]^d}\big|\frac{\partial^{m_1+\ldots+m_d}h(s_1,\ldots,s_d)}{\partial s_1^{m_1}\cdots\partial s_d^{m_d}}\big|<\infty$ ($0\le m_i\le2$) but $h$ is not continuous on the $d$-dimensional torus $[-1/2,1/2]^d$, then $|\gamma_j|\le C\prod_{i=1}^d|j_i|^{-I(j_i\neq0)}$ and the above condition will not be satisfied. However, this assumption can be induced by tapering the observations, so that $X(s_j)$ is replaced with $\widetilde X(s_j) = t(\frac{s_j}{\lambda})X(s_j)$, where $t(s)=\prod_{i=1}^d t(s_i)$ and $t$ is a weight function which has a bounded second derivative, $t(-1/2)=t(1/2)=0$ and $t'(1/2)=t'(-1/2)=0$. By using $\widetilde X(s_j)$ instead of $X(s_j)$, in all the derivations we can replace the density $h(s)$ with $t(s)h(s)$. This means the results now rely on the Fourier coefficients of $t(s)h(s)$, which decay at the rate $\big|\int_{[-1/2,1/2]^d}t(s)h(s)\exp(i2\pi j's)ds\big|\le C\prod_{i=1}^d|j_i|^{-2I(j_i\neq0)}$, and thus the above condition is satisfied. Note that Matsuda and Yajima (2009), Definition 2, use a similar data-tapering scheme to induce a similar condition.
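As a numerical illustration of Assumption 2.3 (ours, with arbitrary choices of density and taper; $d=1$), the following sketch computes the Fourier coefficients $\gamma_j = \int h(u)e^{-i2\pi ju}du$ for a density that is smooth on the torus, for one that is not, and for the tapered version of the latter, showing how the taper restores the required decay.

```python
import numpy as np

def fourier_coef(h, j, m=4096):
    """gamma_j = int_{-1/2}^{1/2} h(u) exp(-i 2 pi j u) du (rectangle rule)."""
    u = np.linspace(-0.5, 0.5, m, endpoint=False)
    return np.mean(h(u) * np.exp(-1j * 2 * np.pi * j * u))

h_smooth = lambda u: 1.0 + 0.5 * np.cos(2 * np.pi * u)   # periodic and smooth
h_rough  = lambda u: 1.0 + 2.0 * u                        # jump at u = +/- 1/2
taper    = lambda u: np.cos(np.pi * u) ** 2               # t(+-1/2) = t'(+-1/2) = 0

for j in [1, 5, 25, 125]:
    print(j,
          abs(fourier_coef(h_smooth, j)),                        # fast decay
          abs(fourier_coef(h_rough, j)),                         # only O(1/j)
          abs(fourier_coef(lambda u: taper(u) * h_rough(u), j))) # decay restored
```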
We now show that under this general sampling scheme the near 'uncorrelated' property of the DFT given in Theorem 2.1 does not hold.

Theorem 2.2 Suppose $\{Z(s);\ s\in\mathbb{R}^d\}$ is a stationary spatial random field whose covariance function (defined in Assumption 2.1(i)) satisfies $|c(s)|\le\beta_{1+\delta}(s)$ for some $\delta>0$. Furthermore, the locations $\{s_j\}$ satisfy Assumption 2.3. Then we have

$$\mathrm{cov}[J_n(\omega_{k_1}),J_n(\omega_{k_2})] = \langle\gamma,\gamma_{k_2-k_1}\rangle f(\omega_{k_1}) + \frac{c(0)\gamma_{k_2-k_1}\lambda^d}{n} + O\Big(\frac1\lambda\Big), \qquad (7)$$

where the bounds are uniform in $k_1,k_2\in\mathbb{Z}^d$ and $\langle\gamma,\gamma_r\rangle = \sum_{j\in\mathbb{Z}^d}\gamma_j\gamma_{r-j}$.

PROOF See Appendix B. □

In the case of uniform sampling, $\gamma_j=0$ for all $j\neq0$ and $\gamma_0=1$; in this case $\mathrm{cov}[J_n(\omega_{k_1}),J_n(\omega_{k_2})] = [f(\omega_{k_1})+c(0)\lambda^d/n]I(k_1=k_2)+O(\lambda^{-1})$. Thus we observe that Theorem 2.2 includes Theorem 2.1 as a special case (up to the number of zeros in the vector $k_1-k_2$; if $b(k_1-k_2)\le(d-2)$, Theorem 2.1 gives a faster rate). On the other hand, we note that in the case the locations are not sampled from a uniform distribution, the DFTs are not (asymptotically) uncorrelated. However, they do satisfy the property

$$\mathrm{cov}[J_n(\omega_{k_1}),J_n(\omega_{k_2})] = O\Big(\frac{I(k_1\neq k_2)}{\|k_1-k_2\|_1^{1+\delta}}+\frac1\lambda+\frac{\lambda^d}{n}\Big).$$

In other words, the correlation between the DFTs decays as the frequencies grow further apart. A similar result was derived in Bandyopadhyay and Lahiri (2009), who show that $J_n(\omega_1)$ and $J_n(\omega_2)$ are asymptotically uncorrelated if their frequencies are 'asymptotically distant', i.e. $\lambda\|\omega_1-\omega_2\|_1\to\infty$.

Remark 2.1 In both Theorems 2.1 and 2.2 we use the assumption that $|c(s)|\le\beta_{1+\delta}(s)$. This assumption is satisfied by a wide range of covariance functions. Examples include:

• The Wendland covariance, since its covariance is bounded and has compact support.

• The Matern covariance, which for $\nu>0$ is defined as $c_\nu(\|s\|_2) = \frac{1}{\Gamma(\nu)2^{\nu-1}}(\sqrt{2\nu}\|s\|_2)^\nu K_\nu(\sqrt{2\nu}\|s\|_2)$ ($K_\nu$ is the modified Bessel function of the second kind). To see why, we note that if $\nu>0$ then $c_\nu(s)$ is a bounded function. Furthermore, for large $\|s\|_2$ we have $c_\nu(\|s\|_2)\sim C_\nu\|s\|_2^{\nu-0.5}\exp(-\sqrt{2\nu}\|s\|_2)$ as $\|s\|_2\to\infty$ (where $C_\nu$ is a finite constant). Thus by using the inequality

$$d^{-1/2}(|s_1|+|s_2|+\ldots+|s_d|)\le\sqrt{s_1^2+s_2^2+\ldots+s_d^2}\le|s_1|+|s_2|+\ldots+|s_d|$$

we can show $|c_\nu(s)|\le\beta_{1+\delta}(s)$ for any $\delta>0$.

Remark 2.2 (Random vs fixed design of locations) Formally, the methods in this paper are developed under the assumption that the locations are random. Of course, as a referee pointed out, it is unclear whether a given design is random or not. However, in the context of the current paper, 'random design' really refers to the locations being 'irregular' in the sense of not being defined on, or in the proximity of, a lattice (such a design would be considered 'near regular'). A simple way to check for this is to evaluate the Fourier transform of the locations

$$b_r = \frac1n\sum_{j=1}^{n}\exp(is_j'\omega_r).$$

In the case that the design is random, $b_r$ is an estimator of $\gamma_r$ (the Fourier coefficients of $h(\cdot)$) and $b_r\approx0$ for $\|r\|_1$ large. On the other hand, if the design is 'near regular', $\{b_r\}$ would be periodic on the grid $\mathbb{Z}^d$. Therefore, by making a plot of $\{b_r;\ r\in[-Kn^{1/d},Kn^{1/d}]^d\}$ (for some $K\in\mathbb{Z}^+$), we can search for repetitions at shifts of $n^{1/d}$, i.e. $b_r\approx b_{r+\underline n^{1/d}}$, where $\underline n^{1/d}=(n^{1/d},\ldots,n^{1/d})$. If $b_r\approx0$ for large $\|r\|_1$, then the locations are sufficiently irregular that we can treat them as random, and the results in this paper are valid. On the other hand, repetitions suggest the locations are 'near regular', and the results in this paper do not hold in that case. If the spatial locations are defined on a regular lattice, it is straightforward to transfer the results for discrete time series to spatial data on the grid (in this case $a=\lambda=n^{1/d}$).
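The following sketch (our illustration of the diagnostic above; $d=1$, and the jittered-lattice design is an assumption of ours) computes $b_r$ for a random and a near-regular design and inspects $|b_r|$ at the shift $r=n$, where a near-regular design shows a repetition.

```python
import numpy as np

rng = np.random.default_rng(1)

def b_coef(s, lam, rmax):
    """b_r = (1/n) sum_j exp(i s_j omega_r), omega_r = 2*pi*r/lam, d = 1."""
    r = np.arange(-rmax, rmax + 1)
    omega = 2 * np.pi * r / lam
    return r, np.exp(1j * np.outer(omega, s)).mean(axis=1)

n, lam = 400, 400.0
s_irregular = rng.uniform(-lam / 2, lam / 2, n)                 # random design
s_near_reg = np.arange(n) - n / 2 + 0.05 * rng.normal(size=n)   # jittered lattice

for s in (s_irregular, s_near_reg):
    r, b = b_coef(s, lam, rmax=2 * n)
    # random design: |b_r| ~ O(n^{-1/2}) away from r = 0;
    # near-regular design: |b_r| spikes again near r = +/- n
    print(np.round(np.abs(b[[2 * n, 2 * n + n]]), 3))           # |b_0|, |b_n|
```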
2.2 Assumptions for the analysis of $Q_{a,\lambda}(\cdot)$

In order to asymptotically analyze the statistic $Q_{a,\lambda}(g_\theta;r)$, we assume that the spatial domain $[-\lambda/2,\lambda/2]^d$ grows as the number of observations $n\to\infty$, and we mainly work under the mixed domain framework described in the introduction, where $\lambda^d/n\to0$ as $\lambda\to\infty$ and $n\to\infty$.

Assumption 2.4 (Assumptions on $g_\theta(\cdot)$) (i) If the frequency grid is bounded to $2\pi[-C,C]^d$ (thus $a=C\lambda$), then $\sup_{\omega\in2\pi[-C,C]^d}|g_\theta(\omega)|<\infty$ and, for all $1\le j\le d$, $\sup_{\omega\in2\pi[-C,C]^d}\big|\frac{\partial g_\theta(\omega)}{\partial\omega_j}\big|<\infty$.

(ii) If the frequency grid is unbounded, then $\sup_{\omega\in\mathbb{R}^d}|g_\theta(\omega)|<\infty$ and, for all $1\le j\le d$, $\sup_{\omega\in\mathbb{R}^d}\big|\frac{\partial g_\theta(\omega)}{\partial\omega_j}\big|<\infty$.

In the case that $a=C\lambda$ (where $|C|<\infty$), the asymptotic properties of $Q_{a,\lambda}(g_\theta;r)$ can mostly be derived by using either Theorem 2.1 or 2.2. For example, in the case that the locations are uniformly sampled, it follows from Theorem 2.1 that

$$E[Q_{a,\lambda}(g_\theta;r)] = \begin{cases} I(g_\theta;C)+O\big(\frac{\lambda^d}{n}+\frac1\lambda\big) & r=0\\ O\big(\frac1\lambda\big) & \text{otherwise}\end{cases}$$

where $I(g_\theta;\cdot)$ is defined in (5), with $\lambda^d/n\to0$ as $\lambda\to\infty$ and $n\to\infty$ (the full details are given in Theorem 3.1). However, this analysis only holds in the case that $a=O(\lambda)$, in other words when the frequency grid is bounded. Situations where this statistic is of interest are rare; notable exceptions include the Whittle likelihood and the spectral density estimator. Usually our aim is to estimate the functional $I(g_\theta;\infty)$; therefore the frequency grid $\{\omega_k;\ k=(k_1,\ldots,k_d),\ -a\le k_i\le a\}$ should be unbounded. In this case, using Theorems 2.1 or 2.2 to analyse $E[Q_{a,\lambda}(g_\theta;r)]$ leads to errors of the order $O(a/\lambda^2)$, which cannot be controlled without placing severe restrictions on the expansion of the frequency grid.

Below we state the assumptions required of the spatial process; below each assumption we explain where it is used. Note that Assumption 2.5(a) is used for the bounded frequency grid and 2.5(b) is (mainly) used for the unbounded frequency grid (the same assumption can also be used for the bounded grid; see the comments below 2.5(b)). It is unclear which of Assumptions 2.5(a) or (b) is stronger, since neither assumption implies the other.

Assumption 2.5 (Conditions on the spatial process) (a) $|c(s)|\le\beta_{1+\delta}(s)$. Required for the bounded frequency grid, to obtain the DFT calculations.

(b) For some $\delta>0$, $f(\omega)\le\beta_\delta(\omega)$. Required for the unbounded frequency grid; using this assumption instead of (a) in the case of a bounded frequency grid leads to slightly larger error bounds in the derivation of the first and second order moments.
This assumption is also used to obtain the CLT result for both the bounded and unbounded frequency grids.

(c) For all $1\le j\le d$ and some $\delta>0$, the partial derivatives satisfy $\big|\frac{\partial f(\omega)}{\partial\omega_j}\big|\le\beta_\delta(\omega)$. We use this condition to approximate sums with integrals for both the bounded and unbounded frequency grids. It is also used to make a series of approximations to derive the limiting variance in the case that the frequency grid is unbounded.

(d) For some $\delta>0$, $\big|\frac{\partial^d f(\omega)}{\partial\omega_1\cdots\partial\omega_d}\big|\le\beta_\delta(\omega)$. Required only in the proof of Theorem 3.1(ii)(b).

Remark 2.3 Assumption 2.5(b,c,d) appears quite technical, but it is satisfied by a wide range of spatial covariance functions. For example, the spectral density of the Matern covariance defined in Remark 2.1 is $f_\nu(\omega)\propto(1+\|\omega\|_2^2)^{-(\nu+d/2)}$. It is straightforward to show that this spectral density satisfies Assumption 2.5(b,c,d), noting that the $\delta$ used to define $\beta_\delta$ will vary with $\nu$, the dimension $d$ and the order of the derivative of $f_\nu(\cdot)$.

We show in Appendix F, Subba Rao (2015b), that

$$E[Q_{a,\lambda}(g_\theta;r)]\to\langle\gamma,\gamma_r\rangle I\Big(g_\theta;\frac a\lambda\Big)+\underbrace{E\Big[\frac{\lambda^d}{n^2}\sum_{j=1}^{n}Z(s_j)^2\exp(-is_j'\omega_r)\Big]}_{=c(0)\gamma_r\lambda^d/n}\cdot\frac{1}{\lambda^d}\sum_{k=-a}^{a}g(\omega_k),$$

where $I(g_\theta;\frac a\lambda)$ is defined in (5). We observe that in the case that the frequency grid is bounded, the second term above is asymptotically negligible. However, in the case that the frequency grid is unbounded, the second term plays a role; indeed it can induce a bias unless the rate of growth of the frequency grid is constrained so that $a^d/n\to0$. Therefore, the main focus of this paper will be on a slight variant of the statistic $Q_{a,\lambda}(g_\theta;r)$, defined as

$$\widetilde Q_{a,\lambda}(g_\theta;r) = Q_{a,\lambda}(g_\theta;r)-\frac{1}{\lambda^d}\sum_{k=-a}^{a}g_\theta(\omega_k)\cdot\frac{\lambda^d}{n}\cdot\frac1n\sum_{j=1}^{n}Z(s_j)^2\exp(-is_j'\omega_r). \qquad (8)$$

$\widetilde Q_{a,\lambda}(g_\theta;r)$ avoids some of the bias issues when the frequency grid is unbounded. For the rest of this paper our focus will be on $\widetilde Q_{a,\lambda}(g_\theta;r)$. Note that Matsuda and Yajima (2009) and Bandyopadhyay et al. (2015) use a similar bias correction method. The results for the non-bias corrected version $Q_{a,\lambda}(g_\theta;r)$ can be found in Appendix F, Subba Rao (2015b).

Below we give some examples of statistics of the form (4) and (8) which satisfy Assumption 2.4.

Example 2.1 (Fixed frequency grid) (i) The discretized Whittle likelihood given in equation (3) can be written in terms of $Q_{a,\lambda}(g_\theta;0)$:

$$\frac{1}{\lambda^d}\sum_{k_1,\ldots,k_d=-C\lambda}^{C\lambda}\Big(\log f_\theta(\omega_k)+\frac{|J_n(\omega_k)|^2}{f_\theta(\omega_k)}\Big) = Q_{a,\lambda}(f_\theta^{-1};0)+\frac{1}{\lambda^d}\sum_{k_1,\ldots,k_d=-C\lambda}^{C\lambda}\log f_\theta(\omega_k)$$

where $\{f_\theta(\cdot);\ \theta\in\Theta\}$ (with $\Theta$ compact) is a parametric family of spectral density functions. The choice of $a=C\lambda$ depends on the rate of decay of $f_\theta(\omega)$ to zero.

(ii) A nonparametric estimator of the spectral density function $f(\omega)$ is

$$\widehat f_{\lambda,n}(\omega) = \frac{1}{\lambda^d}\sum_{k=-a}^{a}W_b(\omega-\omega_k)\Big(|J_n(\omega_k)|^2-\frac{\lambda^d}{n}\cdot\frac1n\sum_{j=1}^{n}Z(s_j)^2\Big) = \widetilde Q_{a,\lambda}(W_b(\omega-\cdot);0),$$

where $W_b(\omega) = b^{-d}\prod_{j=1}^{d}W(\frac{\omega_j}{b})$ and $W:[-1/2,1/2]\to\mathbb{R}$ is a spectral window. In this case we set $a=\lambda/2$.
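A minimal sketch of the estimator in (ii) (ours; $d=1$, with a triangular spectral window and an exponential covariance model chosen purely for illustration; the multiplicative constant depends on the Fourier-transform convention and is chosen here so the output matches $f(\omega)=\int c(u)e^{-i\omega u}du$):

```python
import numpy as np

rng = np.random.default_rng(2)

def spectral_estimate(Z, s, lam, a, b, omegas):
    """Kernel-smoothed, bias-corrected spectral density estimate (d = 1)."""
    n = len(Z)
    wk = 2 * np.pi * np.arange(-a, a + 1) / lam
    J = (lam ** 0.5 / n) * np.exp(1j * np.outer(wk, s)) @ Z
    I = np.abs(J) ** 2 - (lam / n) * np.mean(Z ** 2)  # bias-corrected periodogram
    W = lambda x: np.maximum(1 - np.abs(x), 0.0)      # triangular window, int W = 1
    # 2*pi converts the Riemann sum (1/lam)*sum_k into the integral normalization
    return np.array([2 * np.pi * np.sum(W((w - wk) / b) / b * I) / lam
                     for w in omegas])

# toy data: c(u) = exp(-|u|), so f(w) = 2/(1 + w^2)
lam, n = 30.0, 500
s = rng.uniform(-lam / 2, lam / 2, n)
Z = rng.multivariate_normal(np.zeros(n),
                            np.exp(-np.abs(s[:, None] - s[None, :])))
print(spectral_estimate(Z, s, lam, a=int(lam / 2), b=1.0, omegas=[0.0, 1.0]))
```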
Example 2.2 (Unbounded frequency grid) (i) In Bandyopadhyay and Subba Rao (2015) a test statistic of the type $\widetilde a_n(g;r) = \widetilde Q_{a,\lambda}(g;r)$, where $r\in\mathbb{Z}^d\setminus\{0\}$, is proposed. We show in Section 3.4, under the assumptions that the locations are uniformly distributed and $r\neq0$, that $E[\widetilde Q_{a,\lambda}(g;r)]\approx0$. On the other hand, if the spatial process is nonstationary, this property does not hold (see Bandyopadhyay and Subba Rao (2015) for the details). Using this dichotomy, Bandyopadhyay and Subba Rao (2015) test for stationarity of a spatial process.

(ii) A nonparametric estimator of the spatial (stationary) covariance function is

$$\widehat c_n(v) = T\Big(\frac{2v}{\lambda}\Big)\frac{1}{\lambda^d}\sum_{k=-a}^{a}|J_n(\omega_k)|^2\exp(iv'\omega_k) = T\Big(\frac{2v}{\lambda}\Big)Q_{a,\lambda}(e^{iv'\cdot};0), \qquad (9)$$

where $T(u)=\prod_{j=1}^{d}T(u_j)$ and $T:\mathbb{R}\to\mathbb{R}$ is the triangular kernel defined as $T(u)=(1-|u|)$ for $|u|\le1$ and $T(u)=0$ for $|u|>1$ (a numerical sketch of this estimator is given at the end of this example). It can be shown that the Fourier transform of $\widehat c_n(v)$ is

$$\widehat f(\omega) = \int_{\mathbb{R}^d}\widehat c_n(v)\exp(-i\omega'v)dv = \sum_{k=-a}^{a}|J_n(\omega_k)|^2\,\mathrm{Sinc}^2\Big[\frac\lambda2(\omega_k-\omega)\Big]$$

where $\mathrm{Sinc}(\frac{\lambda\omega}{2}) = \prod_{j=1}^{d}\mathrm{sinc}(\frac{\lambda\omega_j}{2})$ and $\mathrm{sinc}(\omega)=\sin(\omega)/\omega$. Clearly, $\widehat f(\omega)$ is non-negative; therefore, the sample covariance function $\{\widehat c_n(v)\}$ is a non-negative definite function and thus a valid covariance function. Hall et al. (1994) also propose a nonparametric estimator of the spatial covariance function which is a valid covariance function. Their scheme has three stages: (i) a nonparametric pre-estimator of the covariance function is evaluated; (ii) the Fourier transform of the pre-estimator is evaluated over $\mathbb{R}^d$; (iii) negative values of the Fourier transform are set to zero, and the inverse Fourier transform of the result forms the estimator of the spatial covariance function. The estimator (9) avoids evaluating Fourier transforms over $\mathbb{R}^d$ and is simpler to compute.

(iii) We recall that the Whittle likelihood can only be defined on a bounded frequency grid. This can be an issue if the observed locations are dense on the spatial domain and thus contain a large amount of high frequency information which would be missed by the Whittle likelihood. An alternative method for parameter estimation of a spatial process is to use a different loss function. Motivated by Rice (1979), consider the quadratic loss function

$$\mathcal L_n(\theta) = \frac{1}{\lambda^d}\sum_{k_1,\ldots,k_d=-a}^{a}\big(|J_n(\omega_k)|^2-f_\theta(\omega_k)\big)^2, \qquad (10)$$

and let $\widehat\theta_n = \arg\min_{\theta\in\Theta}\mathcal L_n(\theta)$, or equivalently solve $\nabla_\theta\mathcal L_n(\theta)=0$, where

$$\nabla_\theta\mathcal L_n(\theta) = -\frac{2}{\lambda^d}\sum_{k_1,\ldots,k_d=-a}^{a}\Re\nabla_\theta f_\theta(\omega_k)|J_n(\omega_k)|^2+\frac{2}{\lambda^d}\sum_{k=-a}^{a}f_\theta(\omega_k)\nabla_\theta f_\theta(\omega_k) = Q_{a,\lambda}(-2\nabla_\theta f_\theta(\cdot);0)+\frac{2}{\lambda^d}\sum_{k=-a}^{a}f_\theta(\omega_k)\nabla_\theta f_\theta(\omega_k), \qquad (11)$$

and $\Re X$ denotes the real part of the random variable $X$. It is well known that the distributional properties of a quadratic loss function are determined by its first derivative. In particular, the asymptotic sampling properties of $\widehat\theta_n$ are determined by $Q_{a,\lambda}(-2\Re\nabla_\theta f(\cdot;\theta);0)$.

To minimize notation, from now onwards we will suppress the $\theta$ in $\widetilde Q_{a,\lambda}(g_\theta;r)$, using instead $\widetilde Q_{a,\lambda}(g;r)$ (and only give the full form when necessary).
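The following sketch (ours; $d=1$, exponential covariance model chosen for illustration) implements the covariance estimator (9) from Example 2.2(ii), with an optional flag applying the bias correction used later in Section 4.2.

```python
import numpy as np

rng = np.random.default_rng(3)

def cov_estimate(Z, s, lam, a, v, bias_correct=False):
    """Covariance estimator (9), d = 1: c_hat(v) = T(2v/lam) * (1/lam) *
    sum_k |J_n(omega_k)|^2 exp(i v omega_k). With bias_correct=True the
    lam * c(0)/n term in E|J_n|^2 is removed, as in Section 4.2 (at the
    possible price of losing non-negative definiteness)."""
    n = len(Z)
    wk = 2 * np.pi * np.arange(-a, a + 1) / lam
    J = (lam ** 0.5 / n) * np.exp(1j * np.outer(wk, s)) @ Z
    I = np.abs(J) ** 2 - (bias_correct * lam / n) * np.mean(Z ** 2)
    T = np.maximum(1 - np.abs(2 * v / lam), 0.0)   # triangular kernel
    return T * np.real(np.sum(I * np.exp(1j * v * wk))) / lam

lam, n, a = 30.0, 500, 120                          # a >> lam: unbounded grid
s = rng.uniform(-lam / 2, lam / 2, n)
C = np.exp(-np.abs(s[:, None] - s[None, :]))        # c(u) = exp(-|u|)
Z = rng.multivariate_normal(np.zeros(n), C)
for v in [0.0, 1.0, 2.0]:
    print(v, round(cov_estimate(Z, s, lam, a, v, True), 2), round(np.exp(-v), 2))
```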
3 Sampling properties of $\widetilde Q_{a,\lambda}(g;r)$

In this section we show that under certain conditions $\widetilde Q_{a,\lambda}(g;r)$ is a consistent estimator of $I(g;\frac a\lambda)$ (or some multiple of it), where $I(g;\frac a\lambda)$ is defined in (5). We note that $I(g;\frac a\lambda)\to I(g;\infty)$ if $a/\lambda\to\infty$ as $a\to\infty$ and $\lambda\to\infty$.

3.1 The mean of $\widetilde Q_{a,\lambda}(g;r)$

We start with the expectation of $\widetilde Q_{a,\lambda}(g;r)$. As we mentioned earlier, it is unclear how to choose the number of frequencies $a$ in the definition of $\widetilde Q_{a,\lambda}(g;r)$. We show that if $\sup_{\omega\in\mathbb{R}^d}|g(\omega)|<\infty$, the choice of $a$ does not play a significant role in the asymptotic properties of $\widetilde Q_{a,\lambda}(g;r)$. To show this result, we note that by not constraining $a$ so that $a=O(\lambda)$, the number of terms in the sum $Q_{a,\lambda}(g;r)$ grows at a faster rate than the standardisation $\lambda^d$. In this case, the analysis of $\widetilde Q_{a,\lambda}(g;r)$ requires more delicate techniques than those used to prove Theorems 2.1 and 2.2. We start by stating some pertinent features of the analysis of $\widetilde Q_{a,\lambda}(g;r)$, which gives a flavour of our approach. By writing $Q_{a,\lambda}(g;r)$ as a quadratic form, it is straightforward to show that

$$E[Q_{a,\lambda}(g;r)] = c_2\frac{1}{\lambda^d}\sum_{k=-a}^{a}g(\omega_k)\frac{1}{\lambda^d}\int_{[-\lambda/2,\lambda/2]^d}\int_{[-\lambda/2,\lambda/2]^d}c(s_1-s_2)\exp\big(i\omega_k'(s_1-s_2)-is_2'\omega_r\big)h\Big(\frac{s_1}{\lambda}\Big)h\Big(\frac{s_2}{\lambda}\Big)ds_1ds_2,$$

where $c_2 = n(n-1)/n^2$. The proofs of Theorems 2.1 and 2.2 are based on making a change of variables $v=s_1-s_2$ and then systematically changing the limits of the integral. This method works if the frequency grid $2\pi[-a/\lambda,a/\lambda]^d$ is fixed for all $\lambda$. However, if the frequency grid is allowed to grow with $\lambda$, applying this brute force method to $E[\widetilde Q_{a,\lambda}(g;r)]$ has the disadvantage that it aggregates the errors within the sum. Instead, to further the analysis, we replace $c(s_1-s_2)$ by its Fourier transform $c(s_1-s_2) = (2\pi)^{-d}\int_{\mathbb{R}^d}f(\omega)\exp(i\omega'(s_1-s_2))d\omega$ and focus on the case that the sampling design is uniform, $\lambda^{-d}h(s/\lambda) = \lambda^{-d}I_{[-\lambda/2,\lambda/2]^d}(s)$ (later we consider general densities). This reduces the first term in $E[\widetilde Q_{a,\lambda}(g;r)]$ to Fourier transforms of step functions, which are products of sinc functions, $\mathrm{Sinc}(\frac{\lambda\omega}{2})$ (the Sinc function is defined in Example 2.2). Specifically, we obtain

$$E[\widetilde Q_{a,\lambda}(g;r)] = \frac{c_2}{(2\pi)^d}\sum_{k=-a}^{a}g(\omega_k)\int_{\mathbb{R}^d}f(\omega)\,\mathrm{Sinc}\Big(\frac{\lambda\omega}{2}+k\pi\Big)\mathrm{Sinc}\Big(\frac{\lambda\omega}{2}+(k+r)\pi\Big)d\omega = \frac{c_2}{\pi^d}\int_{\mathbb{R}^d}\mathrm{Sinc}(y)\mathrm{Sinc}(y+r\pi)\Big[\frac{1}{\lambda^d}\sum_{k=-a}^{a}g(\omega_k)f\Big(\frac{2y}{\lambda}-\omega_k\Big)\Big]dy,$$

where the last line above is due to the change of variables $y=\frac{\lambda\omega}{2}+k\pi$. Since the spectral density function is absolutely integrable, it is clear that $\frac{1}{\lambda^d}\sum_{k=-a}^{a}g(\omega_k)f(\frac{2y}{\lambda}-\omega_k)$ is uniformly bounded over $y$ and that $E[\widetilde Q_{a,\lambda}(g;r)]$ is finite for all $a$. Furthermore, if $f(\frac{2y}{\lambda}-\omega_k)$ were replaced with $f(-\omega_k)$, then what remains in the integral are two shifted sinc functions, whose integral is zero if $r\in\mathbb{Z}^d\setminus\{0\}$, i.e.

$$E[\widetilde Q_{a,\lambda}(g;r)] = \frac{c_2}{\pi^d}\int_{\mathbb{R}^d}\mathrm{Sinc}(y)\mathrm{Sinc}(y+r\pi)\Big[\frac{1}{\lambda^d}\sum_{k=-a}^{a}g(\omega_k)f(-\omega_k)\Big]dy+R,$$

where

$$R = \frac{c_2}{\pi^d}\int_{\mathbb{R}^d}\mathrm{Sinc}(y)\mathrm{Sinc}(y+r\pi)\Big[\frac{1}{\lambda^d}\sum_{k=-a}^{a}g(\omega_k)\Big(f\Big(\frac{2y}{\lambda}-\omega_k\Big)-f(-\omega_k)\Big)\Big]dy.$$

In the following theorem we show that under certain conditions on $f$, $R$ is asymptotically negligible. Let $b=b(r)$ denote the number of zero elements in the vector $r\in\mathbb{Z}^d$.

Theorem 3.1 Let $I(g;\cdot)$ be defined as in (5). Suppose Assumptions 2.1(i) and 2.2 hold.

(i) If Assumptions 2.4(i) and 2.5(a,c) hold, then we have

$$E[\widetilde Q_{a,\lambda}(g;r)] = \begin{cases} I(g;C)+O(\lambda^{-1}) & r=0\\ O(\lambda^{-(d-b)}) & r\in\mathbb{Z}^d\setminus\{0\}\end{cases} \qquad (12)$$

(ii) Suppose Assumption 2.4(ii) holds and

(a) Assumption 2.5(b) holds; then $\sup_a E[\widetilde Q_{a,\lambda}(g;r)]<\infty$.

(b) Assumption 2.5(b,c,d) holds and $\{m_1,\ldots,m_{d-b}\}$ is the subset of non-zero values in $r=(r_1,\ldots,r_d)$; then we have

$$E[\widetilde Q_{a,\lambda}(g;r)] = \begin{cases} I\big(g;\frac a\lambda\big)+O\big(\frac{\log\lambda}{\lambda}+\frac1n\big) & r=0\\ O\big(\frac{1}{\lambda^{d-b}}\prod_{j=1}^{d-b}(\log\lambda+\log|m_j|)\big) & r\in\mathbb{Z}^d\setminus\{0\}\end{cases} \qquad (13)$$

(c) If only Assumption 2.5(b,c) holds, then the $O\big(\frac{1}{\lambda^{d-b}}\prod_{j=1}^{d-b}(\log\lambda+\log|m_j|)\big)$ term in (b) is replaced with the slower rate $O\big(\frac1\lambda(\log\lambda+\log\|r\|_1)\big)$.

Note that the above bounds for (b) and (c) are uniform in $a$.

PROOF See Appendix A.1. □

We observe that if $r\neq0$, then $\widetilde Q_{a,\lambda}(g;r)$ is estimating zero. It would appear that these terms do not contain any useful information; however, in Section 5 we show how these terms can be used to estimate nuisance parameters.

In order to analyze $E[\widetilde Q_{a,\lambda}(g;r)]$ in the case that the locations are not from a uniform distribution, we return to the expression for $E[Q_{a,\lambda}(g;r)]$ above, replace $c(s_1-s_2)$ by its Fourier representation, and also replace the sampling density $h(s/\lambda)$ by its Fourier representation $h(s/\lambda) = \sum_{j\in\mathbb{Z}^d}\gamma_j\exp(i2\pi j's/\lambda)$, to give

$$E[\widetilde Q_{a,\lambda}(g;r)] = \frac{c_2}{\pi^d}\sum_{j_1,j_2\in\mathbb{Z}^d}\gamma_{j_1}\gamma_{j_2}\int_{\mathbb{R}^d}\Big[\frac{1}{\lambda^d}\sum_{k=-a}^{a}g(\omega_k)f\Big(\frac{2y}{\lambda}-\omega_k\Big)\Big]\mathrm{Sinc}(y)\mathrm{Sinc}(y+(r-j_1-j_2)\pi)dy.$$

This representation allows us to use techniques similar to those used in the uniform sampling case to prove the following result.
Theorem 3.2 Let $I(g;\cdot)$ be defined as in (5). Suppose Assumptions 2.1(i) and 2.3 hold.

(i) If in addition Assumptions 2.4(i) and 2.5(a,c) hold, then we have

$$E[\widetilde Q_{a,\lambda}(g;r)] = \langle\gamma,\gamma_r\rangle I\Big(g;\frac a\lambda\Big)+O(\lambda^{-1}),$$

where the $O(\lambda^{-1})$ is uniform over $r\in\mathbb{Z}^d$.

(ii) If in addition Assumptions 2.4(ii) and 2.5(b,c) hold, then we have

$$E[\widetilde Q_{a,\lambda}(g;r)] = \langle\gamma,\gamma_r\rangle I\Big(g;\frac a\lambda\Big)+O\Big(\frac{\log\lambda+I(r\neq0)\log\|r\|_1}{\lambda}\Big).$$

PROOF See Appendix B in Subba Rao (2015b). □

We observe that by applying Theorem 3.2 to the case that $h$ is uniform (using that $\gamma_0=1$ and $\gamma_j=0$ otherwise) we obtain $E[\widetilde Q_{a,\lambda}(g;r)] = O(\lambda^{-1})$ for $r\neq0$. Hence, in the case that the sampling is uniform, Theorems 3.1 and 3.2 give similar results, though the bounds in Theorem 3.1 are sharper.

Remark 3.1 (Estimation of $\sum_{j\in\mathbb{Z}^d}|\gamma_j|^2$) The above result implies that $E[\widetilde Q_{a,\lambda}(g;0)] = \langle\gamma,\gamma_0\rangle I(g;\frac a\lambda)$. Therefore, to estimate $I(g;\frac a\lambda)$ we require an estimator of $\langle\gamma,\gamma_0\rangle$. To do this, we recall that

$$\langle\gamma,\gamma_0\rangle = \sum_{j\in\mathbb{Z}^d}|\gamma_j|^2 = \frac{1}{\lambda^d}\int_{[-\lambda/2,\lambda/2]^d}h\Big(\frac{s}{\lambda}\Big)^2ds.$$

Therefore, one method for estimating the above integral is to define a grid on $[-\lambda/2,\lambda/2]^d$, estimate $h$ at each point, and then take the average of the squares over the grid (see Remark 1, Matsuda and Yajima (2009)). An alternative, computationally simpler method is to use the estimator proposed in Gine and Nickl (2008), that is,

$$\widehat{\langle\gamma,\gamma_0\rangle} = \frac{2}{n(n-1)b^d}\sum_{1\le j_1<j_2\le n}K\Big(\frac{s_{j_1}-s_{j_2}}{\lambda b}\Big),$$

where $K:[-1/2,1/2]^d\to\mathbb{R}$ is a kernel function. Note that multiplying the above kernel with $\exp(-i\omega_r's_{j_2})$ results in an estimator of $\langle\gamma,\gamma_r\rangle$. In the case $d=1$ and under certain regularity conditions, Gine and Nickl (2008) show that if the bandwidth $b$ is selected in an appropriate way, then $\widehat{\langle\gamma,\gamma_0\rangle}$ attains the classical $O(n^{-1/2})$ rate (see also Bickel and Ritov (1988) and Laurent (1996)). It seems plausible that a similar result holds for $d>1$ (though we do not prove it here). Therefore, an estimator of $I(g;\frac a\lambda)$ is $\widetilde Q_{a,\lambda}(g;0)/\widehat{\langle\gamma,\gamma_0\rangle}$.
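The following sketch (ours; $d=1$, uniform kernel, and a specific sampling density chosen so that $\langle\gamma,\gamma_0\rangle$ is known in closed form) illustrates the kernel estimator of $\langle\gamma,\gamma_0\rangle$ from Remark 3.1.

```python
import numpy as np

rng = np.random.default_rng(4)

def gamma0_inner(s, lam, b):
    """Kernel estimator of <gamma, gamma_0> = int h(u)^2 du (d = 1),
    with a uniform kernel K = 1 on [-1/2, 1/2]."""
    n = len(s)
    u = s / lam
    diff = np.abs(u[:, None] - u[None, :]) / b       # (u_j1 - u_j2) / b
    K = (diff <= 0.5).astype(float)
    np.fill_diagonal(K, 0.0)                         # exclude j1 = j2 terms
    return K.sum() / (n * (n - 1) * b)

lam, n, b = 100.0, 2000, 0.05
# h(u) = 1 + cos(2*pi*u)/2: gamma_0 = 1, gamma_{+-1} = 1/4,
# so <gamma, gamma_0> = 1 + 2*(1/4)^2 = 1.125
u = rng.uniform(-0.5, 0.5, 4 * n)                    # rejection sampling from h
u = u[rng.uniform(0, 1.5, 4 * n) < 1 + 0.5 * np.cos(2 * np.pi * u)][:n]
print(gamma0_inner(u * lam, lam, b))                 # approx 1.125
```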
3.2 The covariance and asymptotic normality

In the previous section we showed that the expectation of $\widetilde Q_{a,\lambda}(g;r)$ depends on the number of frequencies $a$ only through the limits of the integral $I(g;\frac a\lambda)$ (if $\sup_{\omega\in\mathbb{R}^d}|g(\omega)|<\infty$). In this section, we show that $a$ plays a mild role in the higher order properties of $\widetilde Q_{a,\lambda}(g;r)$. We focus on the case that the random field is Gaussian and later describe how the results differ in the case that the random field is non-Gaussian.

Theorem 3.3 Suppose Assumptions 2.1, 2.3 and

(i) Assumptions 2.4(i) and 2.5(a,c) hold. Then uniformly for all $0\le\|r_1\|_1,\|r_2\|_1\le C|a|$ (for some finite constant $C$) we have

$$\lambda^d\,\mathrm{cov}\big[\widetilde Q_{a,\lambda}(g;r_1),\widetilde Q_{a,\lambda}(g;r_2)\big] = U_1(r_1,r_2;\omega_{r_1},\omega_{r_2})+O\Big(\frac1\lambda+\frac{\lambda^d}{n}\Big)$$

and

$$\lambda^d\,\mathrm{cov}\big[\widetilde Q_{a,\lambda}(g;r_1),\overline{\widetilde Q_{a,\lambda}(g;r_2)}\big] = U_2(r_1,r_2;\omega_{r_1},\omega_{r_2})+O\Big(\frac1\lambda+\frac{\lambda^d}{n}\Big).$$

(ii) Assumptions 2.4(ii) and 2.5(b) hold. Then we have

$$\sup_{a,r_1,r_2}\lambda^d\,\big|\mathrm{cov}\big[\widetilde Q_{a,\lambda}(g;r_1),\widetilde Q_{a,\lambda}(g;r_2)\big]\big|<\infty \quad\text{and}\quad \sup_{a,r_1,r_2}\lambda^d\,\big|\mathrm{cov}\big[\widetilde Q_{a,\lambda}(g;r_1),\overline{\widetilde Q_{a,\lambda}(g;r_2)}\big]\big|<\infty,$$

if $\lambda^d/n\to c$ (where $0\le c<\infty$) as $\lambda\to\infty$ and $n\to\infty$.

(iii) Assumptions 2.4(ii) and 2.5(b,c) hold. Then uniformly for all $0\le\|r_1\|_1,\|r_2\|_1\le C|a|$ (for some finite constant $C$) we have

$$\lambda^d\,\mathrm{cov}\big[\widetilde Q_{a,\lambda}(g;r_1),\widetilde Q_{a,\lambda}(g;r_2)\big] = U_1(r_1,r_2;\omega_{r_1},\omega_{r_2})+O(\ell_{\lambda,a,n})$$

and

$$\lambda^d\,\mathrm{cov}\big[\widetilde Q_{a,\lambda}(g;r_1),\overline{\widetilde Q_{a,\lambda}(g;r_2)}\big] = U_2(r_1,r_2;\omega_{r_1},\omega_{r_2})+O(\ell_{\lambda,a,n}),$$

where

$$\ell_{\lambda,a,n} = \log^2(a)\frac{\log a+\log\lambda}{\lambda}+\frac{\lambda^d}{n}. \qquad (14)$$

Note that if we drop the restriction on $r_1$ and $r_2$ and simply let $r_1,r_2\in\mathbb{Z}^d$, then the bound $\ell_{\lambda,a,n}$ needs to include the additional term $\log^2(a)(\log\|r_1\|_1+\log\|r_2\|_1)/\lambda$. $U_1(\cdot)$ and $U_2(\cdot)$ are defined as

$$U_1(r_1,r_2;\omega_{r_1},\omega_{r_2}) = U_{1,1}(r_1,r_2;\omega_{r_1},\omega_{r_2})+U_{1,2}(r_1,r_2;\omega_{r_1},\omega_{r_2}), \qquad U_2(r_1,r_2;\omega_{r_1},\omega_{r_2}) = U_{2,1}(r_1,r_2;\omega_{r_1},\omega_{r_2})+U_{2,2}(r_1,r_2;\omega_{r_1},\omega_{r_2}) \qquad (15)$$

with

$$U_{1,1}(r_1,r_2;\omega_{r_1},\omega_{r_2}) = \frac{1}{(2\pi)^d}\sum_{j_1+j_2+j_3+j_4=r_1-r_2}\gamma_{j_1}\gamma_{j_2}\gamma_{j_3}\gamma_{j_4}\int_{D_{(j_1+j_3)}}g(\omega)\overline{g(\omega+\omega_{j_1+j_3})}f(\omega+\omega_{j_1})f(\omega+\omega_{r_1-j_2})d\omega$$

$$U_{1,2}(r_1,r_2;\omega_{r_1},\omega_{r_2}) = \frac{1}{(2\pi)^d}\sum_{j_1+j_2+j_3+j_4=r_1-r_2}\gamma_{j_1}\gamma_{j_2}\gamma_{j_3}\gamma_{j_4}\int_{D_{(r_1-j_2-j_3)}}g(\omega)\overline{g(\omega-\omega_{j_2+j_3})}f(\omega+\omega_{j_1})f(\omega+\omega_{r_1-j_2})d\omega$$

$$U_{2,1}(r_1,r_2;\omega_{r_1},\omega_{r_2}) = \frac{1}{(2\pi)^d}\sum_{j_1+j_2+j_3+j_4=r_1+r_2}\gamma_{j_1}\gamma_{j_2}\gamma_{j_3}\gamma_{j_4}\int_{D_{(j_1+j_3)}}g(\omega)g(-\omega-\omega_{j_2+j_3-r_1})f(\omega+\omega_{j_1})f(\omega+\omega_{r_1-j_2})d\omega$$

$$U_{2,2}(r_1,r_2;\omega_{r_1},\omega_{r_2}) = \frac{1}{(2\pi)^d}\sum_{j_1+j_2+j_3+j_4=r_1+r_2}\gamma_{j_1}\gamma_{j_2}\gamma_{j_3}\gamma_{j_4}\int_{D_{(r_1-j_2-j_3)}}g(\omega)g(-\omega)f(\omega+\omega_{j_1})f(\omega+\omega_{r_1-j_2})d\omega,$$

where the domain of integration is $D_r = \prod_{i=1}^{d}\big[\frac{2\pi\max(-a,-a-r_i)}{\lambda},\frac{2\pi\min(a,a-r_i)}{\lambda}\big]$.

PROOF See Appendix B in Subba Rao (2015b). □

We now briefly discuss the above results. From Theorem 3.3(iii) we see that $\widetilde Q_{a,\lambda}(g;r)$ is a mean squared consistent estimator of $\langle\gamma,\gamma_r\rangle I(g;\frac a\lambda)$, i.e. $E[\widetilde Q_{a,\lambda}(g;r)-\langle\gamma,\gamma_r\rangle I(g;\frac a\lambda)]^2 = O(\lambda^{-d}+(\frac{\log\lambda}{\lambda}+\frac1n)^2)$ as $a\to\infty$ and $\lambda\to\infty$. Indeed, if $\sup_{\omega\in\mathbb{R}^d}|g(\omega)|<\infty$, then the rate at which $a/\lambda\to\infty$ plays no role. However, in order to obtain an explicit expression for the variance, additional conditions are required. In the case that the frequency grid is bounded, we obtain an expression for the variance of $\widetilde Q_{a,\lambda}(g;r)$. On the other hand, we observe (see Theorem 3.3(iii)) that if the frequency grid is unbounded, we require some additional conditions on the spectral density function and some mild constraints on the rate of growth of the frequency domain $a$; more precisely, $a$ should be such that $a=O(\lambda^k)$ for some $1\le k<\infty$. If these conditions are fulfilled, then the asymptotic variance of $\widetilde Q_{a,\lambda}(g;r)$ (up to the limits of an integral) is the same for both the bounded and unbounded frequency grids. By comparing Theorem 3.3(i) and (iii), we observe that in the case the frequency grid is bounded, the same result can be proved under Assumption 2.5(b,c) rather than 2.5(a,c); however, the error bounds would be slightly larger.

The expressions above are rather unwieldy; however, some simplifications can be made if $\|r_1\|_1\ll\lambda$ and $\|r_2\|_1\ll\lambda$.

Corollary 3.1 Suppose Assumptions 2.3, 2.4 and 2.5(a,c) or 2.5(b,c) hold. Then we have

$$U_1(r_1,r_2;\omega_{r_1},\omega_{r_2}) = \kappa_{r_1-r_2}C_1+O\Big(\frac{\|r_1\|_1+\|r_2\|_1+1}{\lambda}\Big)$$

$$U_2(r_1,r_2;\omega_{r_1},\omega_{r_2}) = \kappa_{r_1+r_2}C_2+O\Big(\frac{\|r_1\|_1+\|r_2\|_1+1}{\lambda}\Big)$$

where $\kappa_r = \sum_{j_1+j_2+j_3+j_4=r}\gamma_{j_1}\gamma_{j_2}\gamma_{j_3}\gamma_{j_4}$,

$$C_1 = \frac{1}{(2\pi)^d}\int_{2\pi[-a/\lambda,a/\lambda]^d}f(\omega)^2\big[|g(\omega)|^2+g(\omega)\overline{g(-\omega)}\big]d\omega$$

and

$$C_2 = \frac{1}{(2\pi)^d}\int_{2\pi[-a/\lambda,a/\lambda]^d}f(\omega)^2\big[g(\omega)g(-\omega)+g(\omega)^2\big]d\omega.$$

Using the above expressions for $\lambda^d\,\mathrm{cov}[\widetilde Q_{a,\lambda}(g;r_1),\widetilde Q_{a,\lambda}(g;r_2)]$ and $\lambda^d\,\mathrm{cov}[\widetilde Q_{a,\lambda}(g;r_1),\overline{\widetilde Q_{a,\lambda}(g;r_2)}]$, the variance of the real and imaginary parts of $\widetilde Q_{a,\lambda}(g;r)$ can easily be deduced.
In the following theorem we derive bounds for the cumulants of $\widetilde Q_{a,\lambda}(g;r)$, which are subsequently used to show asymptotic normality of $\widetilde Q_{a,\lambda}(g;r)$.

Theorem 3.4 Suppose Assumptions 2.1, 2.3, 2.4 and 2.5(b) hold. Then for all $q\ge3$ and uniformly in $r_1,\ldots,r_q\in\mathbb{Z}^d$ we have

$$\mathrm{cum}_q\big[\widetilde Q_{a,\lambda}(g,r_1),\ldots,\widetilde Q_{a,\lambda}(g,r_q)\big] = O\Big(\frac{\log^{2d(q-2)}(a)}{\lambda^{d(q-1)}}\Big) \qquad (16)$$

if $\frac{\lambda^d}{n}\log^{2d}(a)\to0$ as $n\to\infty$, $a\to\infty$ and $\lambda\to\infty$.

PROOF See Appendix B in Subba Rao (2015b). □

From the above theorem we see that if $\frac{\lambda^d}{n}\log^{2d}(a)\to0$ and $\log^2(a)/\lambda^{1/2}\to0$ as $\lambda\to\infty$, $n\to\infty$ and $a\to\infty$, then we have $\lambda^{dq/2}\mathrm{cum}_q(\widetilde Q_{a,\lambda}(g,r_1),\ldots,\widetilde Q_{a,\lambda}(g,r_q))\to0$ for all $q\ge3$. Using this result we show asymptotic Gaussianity of $\widetilde Q_{a,\lambda}(g,r)$.

Theorem 3.5 Suppose Assumptions 2.1, 2.3, 2.4 and 2.5(b,c) hold. Let $C_1$ and $C_2$ be defined as in Corollary 3.1. Under these conditions we have

$$\lambda^{d/2}\Sigma^{-1/2}\begin{pmatrix}\Re\widetilde Q_{a,\lambda}(g,r)-\langle\gamma,\gamma_r\rangle\Re I(g;\frac a\lambda)\\ \Im\widetilde Q_{a,\lambda}(g,r)-\langle\gamma,\gamma_r\rangle\Im I(g;\frac a\lambda)\end{pmatrix}\xrightarrow{D}N(0,I_2),$$

where $\Re X$ and $\Im X$ denote the real and imaginary parts of the random variable $X$, and

$$\Sigma = \frac12\begin{pmatrix}\Re(\kappa_0C_1+\kappa_{2r}C_2) & \Im(\kappa_0C_1+\kappa_{2r}C_2)\\ \Im(\kappa_0C_1+\kappa_{2r}C_2) & \Re(\kappa_0C_1-\kappa_{2r}C_2)\end{pmatrix},$$

with $\log^2(a)\lambda^{-1/2}\to0$ as $\lambda\to\infty$, $n\to\infty$ and $a\to\infty$.

PROOF See Appendix D, Subba Rao (2015b). □

It is likely that the above result also holds when the assumption of Gaussianity of the spatial random field is relaxed and replaced with the conditions stated in Theorem 3.6 (below), together with some mixing-type assumptions. We leave this for future work. However, in the following theorem, we obtain an expression for the variance of $\widetilde Q_{a,\lambda}(g;r)$ for non-Gaussian random fields. We make the assumption that $\{Z(s);\ s\in\mathbb{R}^d\}$ is fourth order stationary, in the sense that $E[Z(s)]=\mu$, $\mathrm{cov}[Z(s_1),Z(s_2)]=c(s_1-s_2)$, $\mathrm{cum}[Z(s_1),Z(s_2),Z(s_3)]=\kappa_3(s_1-s_2,s_1-s_3)$ and $\mathrm{cum}[Z(s_1),Z(s_2),Z(s_3),Z(s_4)]=\kappa_4(s_1-s_2,s_1-s_3,s_1-s_4)$, for some functions $\kappa_3(\cdot)$ and $\kappa_4(\cdot)$ and all $s_1,\ldots,s_4\in\mathbb{R}^d$.

Theorem 3.6 Suppose $\{Z(s);\ s\in\mathbb{R}^d\}$ is a fourth order stationary spatial random field that satisfies Assumption 2.1(i). Furthermore, suppose that Assumptions 2.3, 2.4 and 2.5(a,c) or 2.5(b,c) are satisfied. Let $f_4(\omega_1,\omega_2,\omega_3) = \int_{\mathbb{R}^{3d}}\kappa_4(s_1,s_2,s_3)\exp(-i\sum_{j=1}^{3}s_j'\omega_j)ds_1ds_2ds_3$. We assume that, for some $\delta>0$, the spatial tri-spectral density function satisfies $|f_4(\omega_1,\omega_2,\omega_3)|\le\beta_\delta(\omega_1)\beta_\delta(\omega_2)\beta_\delta(\omega_3)$ and $\big|\frac{\partial f_4(\omega_1,\ldots,\omega_{3d})}{\partial\omega_j}\big|\le\beta_\delta(\omega_1)\beta_\delta(\omega_2)\beta_\delta(\omega_3)$ (where $\beta_\delta(\cdot)$ is defined in (6)). If $\|r_1\|_1,\|r_2\|_1\ll\lambda$, then we have

$$\lambda^d\,\mathrm{cov}\big[\widetilde Q_{a,\lambda}(g;r_1),\widetilde Q_{a,\lambda}(g;r_2)\big] = \kappa_{r_1-r_2}(C_1+D_1)+O\Big(\ell_{\lambda,a,n}+\frac{(a\lambda)^d}{n^2}+\frac{\|r_1\|_1+\|r_2\|_1}{\lambda}\Big) \qquad (17)$$

$$\lambda^d\,\mathrm{cov}\big[\widetilde Q_{a,\lambda}(g;r_1),\overline{\widetilde Q_{a,\lambda}(g;r_2)}\big] = \kappa_{r_1+r_2}(C_2+D_2)+O\Big(\ell_{\lambda,a,n}+\frac{(a\lambda)^d}{n^2}+\frac{\|r_1\|_1+\|r_2\|_1}{\lambda}\Big), \qquad (18)$$

where $C_1$ and $C_2$ are defined as in Corollary 3.1 and

$$D_1 = \frac{1}{(2\pi)^{2d}}\int_{2\pi[-a/\lambda,a/\lambda]^{2d}}g(\omega_1)\overline{g(\omega_2)}f_4(-\omega_1,\omega_2,-\omega_2)d\omega_1d\omega_2$$

$$D_2 = \frac{1}{(2\pi)^{2d}}\int_{2\pi[-a/\lambda,a/\lambda]^{2d}}g(\omega_1)g(\omega_2)f_4(-\omega_1,\omega_2,-\omega_2)d\omega_1d\omega_2.$$

Note that we can drop the restriction $\|r_1\|_1,\|r_2\|_1\ll\lambda$ and let $r_1,r_2\in\mathbb{Z}^d$; under this more general assumption we obtain expressions that are similar to those for $U_1$ and $U_2$ (given in Theorem 3.3), and we need to include $\log^2(a)[\|r_1\|_1+\|r_2\|_1]/\lambda$ in the bounds.

PROOF See Appendix B in Subba Rao (2015b). □

We observe that, to ensure the term $\frac{(a\lambda)^d}{n^2}\to0$, we need to choose $a$ such that $a^d = o(n^2/\lambda^d)$. We recall that for Gaussian random fields $a=O(\lambda^k)$ for some $1\le k<\infty$ was sufficient for obtaining an expression for the variance and asymptotic normality. In contrast, in the case that the spatial random field is non-Gaussian, stronger assumptions are required on $a$ (indeed, if $\frac{(a\lambda)^d}{n^2}\to\infty$, the variance $\lambda^d\,\mathrm{var}[\widetilde Q_{a,\lambda}(g;r)]$ is not bounded).

3.3 Alternative frequency grids and asymptotics

In this subsection we discuss two issues related to the estimator $\widetilde Q_{a,\lambda,\alpha}(g;r)$.

3.3.1 Frequency grids

In a recent paper, Bandyopadhyay et al.
(2015) define the spatial empirical likelihood within the frequency domain. They use the spatial 'periodogram' $\{|J_n(\omega_{\alpha,k})|^2;\ k\in\mathbb{Z}^d\ \text{and}\ k\in C[-\lambda^\eta,\lambda^\eta]^d\}$ with $\omega_{\alpha,k} = \frac{2\pi k}{\lambda^\alpha}$ as the building blocks of the empirical likelihood. The way in which they define their frequency grid is different from the frequency grid defined in this paper. More precisely, whereas $\{\omega_k=\frac{2\pi k}{\lambda};\ k=(k_1,\ldots,k_d),\ -a\le k_j\le a\}$ is used to construct $\widetilde Q_{a,\lambda}(g;r)$, Bandyopadhyay et al. (2015) use the frequency grid $\{\omega_{\alpha,k}=\frac{2\pi k}{\lambda^\alpha};\ k=(k_1,\ldots,k_d),\ -\lambda^\eta\le k_j\le\lambda^\eta\}$, where $0<(1-\eta)<\alpha<1$. In other words, by using $\omega_{\alpha,k}$ to define their frequency grid, they ensure that the frequencies are sufficiently far apart to induce near uncorrelatedness (indeed near independence) of $\{|J_n(\omega_{\alpha,k})|^2\}$, even in the case that the distribution of the locations is not uniform (see the discussion below Theorem 2.2). This construction is crucial in ensuring that the statistics used to define the empirical likelihood are asymptotically pivotal.

In this section we consider the sampling properties of $\widetilde Q$ under the $\{\omega_{\alpha,k}=\frac{2\pi k}{\lambda^\alpha};\ k=(k_1,\ldots,k_d),\ -a\le k_j\le a\}$ frequency grid. To do this, we note that under this frequency grid $\widetilde Q$ is defined as

$$\widetilde Q_{a,\lambda,\alpha}(g;r) = \frac{1}{\lambda^{d(1-\alpha)}}\sum_{k_1,\ldots,k_d=-a}^{a}g(\omega_{\alpha,k})J_n(\omega_{\alpha,k})\overline{J_n(\omega_{\alpha,k+r})}-\frac{1}{\lambda^{d(1-\alpha)}}\sum_{k=-a}^{a}g(\omega_{\alpha,k})\frac{\lambda^{d\alpha}}{n}\cdot\frac1n\sum_{j=1}^{n}Z(s_j)^2\exp(-is_j'\omega_{\alpha,r}), \qquad (19)$$

where we assume $\lambda^\alpha\in\mathbb{Z}^+$. Comparing $\widetilde Q_{a,\lambda,\alpha}(g;r)$ with $\widetilde Q_{a,\lambda}(g;r)$, we see that they are the same when we set $\alpha=0$. The methods developed in this paper can easily be extended to the analysis of $\widetilde Q_{a,\lambda,\alpha}(g;r)$. More precisely, using the same method discussed in Section 3.1 it can be shown that

$$E[\widetilde Q_{a,\lambda,\alpha}(g;r)] = \frac{c_2}{\pi^d}\sum_{j_1,j_2\in\mathbb{Z}^d}\gamma_{j_1}\gamma_{j_2}\int_{\mathbb{R}^d}\Big[\frac{1}{\lambda^{d(1-\alpha)}}\sum_{k=-a}^{a}g(\omega_{\alpha,k})f\Big(\frac{2y}{\lambda}-\omega_{\alpha,k}\Big)\Big]\mathrm{Sinc}(y)\mathrm{Sinc}(y+\lambda^{1-\alpha}(r-j_1-j_2)\pi)dy.$$

And by using the same method used to prove Theorem 3.2 we have

$$E[\widetilde Q_{\lambda,a,\alpha}(g;r)] = \langle\gamma,\gamma_{\lambda^{1-\alpha}r}\rangle I\Big(g;\frac{a}{\lambda^{1-\alpha}}\Big)+O\Big(\frac{\log\lambda+\log\|r\|_1}{\lambda^{1-\alpha}}\Big). \qquad (20)$$

Similarly, it can be shown that

$$\lambda^{d(1-\alpha)}\,\mathrm{cov}\big[\widetilde Q_{a,\lambda,\alpha}(g;r_1),\widetilde Q_{a,\lambda,\alpha}(g;r_2)\big] = \kappa_{\lambda^{1-\alpha}(r_1-r_2)}\widetilde C_1+O\Big(\ell_{\lambda^{1-\alpha},a,n}+\frac{(a\lambda^\alpha)^d}{n^2}I_{S=NG}+\frac{\|r_1\|_1+\|r_2\|_1+1}{\lambda^{1-\alpha}}\Big) \qquad (21)$$

where $I_{S=NG}$ denotes the indicator variable, such that $I_{S=NG}=1$ if the spatial process $\{Z(s);\ s\in\mathbb{R}^d\}$ is non-Gaussian and zero otherwise, and $\widetilde C_1$ is defined in exactly the same way as $C_1$, but the integral over $2\pi[-a/\lambda^{1-\alpha},a/\lambda^{1-\alpha}]^d$ replaces that over $2\pi[-a/\lambda,a/\lambda]^d$. Note that when using this grid, for non-Gaussian random fields the term involving the fourth order cumulant is asymptotically negligible (however, this is not true for the case $\alpha=0$). Bandyopadhyay et al. (2015), Lemma 7.4, prove the same result for $\lambda^{d(1-\alpha)}\,\mathrm{var}[\widetilde Q_{a,\lambda,\alpha}(g;r)]$.

In many respects, the results above are analogous to those in the case that the locations come from the uniform distribution (see Section 3.4 below). More precisely, by using a frequency grid where the frequencies are defined sufficiently far apart, $\widetilde Q_{a,\lambda,\alpha}(g;0)$ is a consistent estimator of $\langle\gamma,\gamma_0\rangle I(g;\frac{a}{\lambda^{1-\alpha}})$, whereas for $r\neq0$, $\widetilde Q_{a,\lambda,\alpha}(g;r)$ is a consistent estimator of zero. Furthermore, $\{\widetilde Q_{a,\lambda,\alpha}(g;r);\ r\in\mathbb{Z}^d\}$ is a near uncorrelated sequence. However, unlike the case that the locations are uniformly distributed, it is unclear how $\{\widetilde Q_{a,\lambda,\alpha}(g;r);\ r\neq0\}$ can be exploited to estimate the variance of $\widetilde Q_{a,\lambda,\alpha}(g;0)$.

3.3.2 Mixed domain versus pure increasing domain asymptotics

The asymptotics in this paper are done using mixed domain asymptotics; that is, as the domain $\lambda\to\infty$,
the number of locations observed grows at a faster rate than $\lambda^d$, in other words $\lambda^d/n\to0$ as $n\to\infty$. However, as rightly pointed out by a referee, for a given application it may be difficult to disambiguate the mixed domain (MD) set-up from the pure increasing domain (PID) set-up, where $\lambda^d/n\to c$ ($0<c<\infty$). We briefly discuss how the results change under PID asymptotics and the implications of this. We find that the results point to a rather intriguing difference between spatial processes that are Gaussian and non-Gaussian.

As mentioned above, in the case of PID asymptotics $\lambda^d/n$ is not asymptotically negligible. This means that the covariance $\lambda^d\,\mathrm{cov}[\widetilde Q_{a,\lambda}(g;r_1),\widetilde Q_{a,\lambda}(g;r_2)]$ will involve extra terms. These additional terms can be deduced from the proof, but are long, so we do not state them here. Interestingly, in the case that the locations are uniformly distributed, the procedure described in Section 5 can still be used to estimate the variance without any adjustments to the method (in other words, this method is robust to the asymptotics used). This brings us to the most curious difference between the MD and PID cases. Comparing Theorems 3.3 and 3.6 we observe the following:

• In the case that the spatial process is Gaussian, using both MD and PID asymptotics we have $\lambda^d\,\mathrm{var}[\widetilde Q_{a,\lambda}(g;r)] = O(1)$ (see Theorem 3.3(i)). Furthermore, an asymptotic expression for the variance is

$$\lambda^d\,\mathrm{var}[\widetilde Q_{a,\lambda}(g;r)] = \kappa_0[C_1+E_1]+O\Big(\log^2(a)\frac{\log a+\log\lambda}{\lambda}+\frac{\|r\|_1+1}{\lambda}\Big),$$

where $E_1$ is the additional term that is not asymptotically negligible under PID asymptotics; $E_1 = O(\lambda^d/n)$. From the above we see that if we choose $a$ such that $a=O(\lambda^k)$ for some $1\le k<\infty$, then the frequency grid is unbounded, and results similar to those stated in Sections 3.1 and 3.2 hold under PID asymptotics.

• In the case that the process is non-Gaussian, using Theorem 3.6 we have

$$\lambda^d\,\mathrm{var}[\widetilde Q_{a,\lambda}(g;r)] = \kappa_0[C_1+D_1+E_1+F_1]+O\Big(\log^2(a)\frac{\log a+\log\lambda}{\lambda}+\frac{(a\lambda)^d}{n^2}+\frac{\|r\|_1+1}{\lambda}\Big),$$

where $E_1+F_1$ are asymptotically negligible under MD asymptotics but not negligible under PID asymptotics; $E_1+F_1 = O(\lambda^d/n)$. Now $a$ plays a vital role in the properties of the variance. Indeed, from the proof of Theorem 3.6 we can show that if $\frac{(a\lambda)^d}{n^2}\to\infty$ as $a,\lambda,n\to\infty$, then $\lambda^d\,\mathrm{var}[\widetilde Q_{a,\lambda}(g;r)]$ is not bounded. In the case of MD asymptotics we choose $a$ such that $(a\lambda)^d/n^2\to0$ and $\log^3(a)/\lambda\to0$. Under these two conditions the frequency grid can be unbounded and grow at the rate $a/\lambda$ as $\lambda\to\infty$. However, under PID asymptotics (where $\lambda=O(n^{1/d})$), in order to ensure that $(a\lambda)^d/n^2 = O(1)$ we require $a=O(n^{1/d})=O(\lambda)$. This constrains the frequency grid to be bounded. In other words, in the case that the spatial process is non-Gaussian, in order that $\mathrm{var}[\widetilde Q_{a,\lambda}(g;r)] = O(\lambda^{-d})$, the frequency grid must be bounded.

However, it is interesting to note that it is still possible to have an unbounded frequency grid for non-Gaussian random processes under PID asymptotics if we use the estimator $\widetilde Q_{a,\lambda,\alpha}(g;r)$ (defined in (19) with $0<\alpha<1$). In this case, let $a=O(\lambda)$ (thus the condition $(a\lambda^\alpha)^d/n^2=O(1)$ is satisfied), which gives rise to the frequency grid $\{\omega_{\alpha,k}=\frac{2\pi k}{\lambda^\alpha};\ k=(k_1,\ldots,k_d),\ -\lambda\le k_j\le\lambda\}$; thus the frequency grid grows at the rate $a/\lambda^\alpha = \lambda^{1-\alpha}$. Using this estimator we have $\lambda^{d(1-\alpha)}\,\mathrm{var}[\widetilde Q_{a,\lambda,\alpha}(g;r)] = O(1)$. We note that these conditions are very similar to those given in Bandyopadhyay et al. (2015) (see condition C.4(i)).

3.4 Uniformly sampled locations

In many situations it is reasonable to suppose that the locations are uniformly distributed over a region.
In this case, many of the results stated above can be simplified. These simplifications allow us to develop a simple method for estimating nuisance parameters in Section 5.

We first recall that if the locations are uniformly sampled, then $\lambda^{-d}h(s/\lambda) = \lambda^{-d}I_{[-\lambda/2,\lambda/2]^d}(s)$; therefore, the Fourier coefficients of $h(\cdot)$ are simply $\gamma_0=1$ and $\gamma_j=0$ if $j\neq0$. This implies that $\kappa_r$ (defined in Corollary 3.1) is such that $\kappa_r=0$ if $r\neq0$ and $\kappa_0=1$. Therefore, if $\{Z(s);\ s\in\mathbb{R}^d\}$ is a stationary spatial random process that satisfies the assumptions in Theorem 3.6 (with uniform sampling of the locations) and $\|r\|_1\ll\lambda$,

$$\lambda^d\,\mathrm{cov}\big[\Re\widetilde Q_{a,\lambda}(g;r_1),\Re\widetilde Q_{a,\lambda}(g;r_2)\big] = \begin{cases}\frac12[C_1+D_1]+O\big(\ell_{\lambda,a,n}+\frac{(a\lambda)^dI_{S=NG}}{n^2}+\frac{\|r\|_1}{\lambda}\big) & r_1=r_2\ (=r)\\ O\big(\ell_{\lambda,a,n}+\frac{(a\lambda)^dI_{S=NG}}{n^2}\big) & r_1\notin\{r_2,-r_2\}\end{cases}$$

$$\lambda^d\,\mathrm{cov}\big[\Im\widetilde Q_{a,\lambda}(g;r_1),\Im\widetilde Q_{a,\lambda}(g;r_2)\big] = \begin{cases}\frac12[C_1+D_1]+O\big(\ell_{\lambda,a,n}+\frac{(a\lambda)^dI_{S=NG}}{n^2}+\frac{\|r\|_1}{\lambda}\big) & r_1=r_2\ (=r)\\ O\big(\ell_{\lambda,a,n}+\frac{(a\lambda)^dI_{S=NG}}{n^2}\big) & r_1\notin\{r_2,-r_2\}\end{cases} \qquad (22)$$

and $\lambda^d\,\mathrm{cov}[\Re\widetilde Q_{a,\lambda}(g;r_1),\Im\widetilde Q_{a,\lambda}(g;r_2)] = O(\ell_{\lambda,a,n}+\frac{(a\lambda)^dI_{S=NG}}{n^2})$, where $I_{S=NG}$ denotes the indicator variable which is one if the random field is non-Gaussian, and zero otherwise (see Appendix A.2). From the above we observe that $\{\Re\widetilde Q_{a,\lambda}(g;r),\Im\widetilde Q_{a,\lambda}(g;r);\ r\in\mathbb{Z}^d\}$ is a 'near uncorrelated' sequence, where in the case $r\neq0$, $\Re\widetilde Q_{a,\lambda}(g;r)$ and $\Im\widetilde Q_{a,\lambda}(g;r)$ are consistent estimators of zero. It is this property we exploit in Section 5 for estimating the variance of $\widetilde Q_{a,\lambda}(g;0)$. Below we show asymptotic normality in the case of uniformly distributed locations.

Theorem 3.7 (CLT on real and imaginary parts) Suppose Assumptions 2.1, 2.2, 2.5(b,c) and 2.4(i) or 2.4(ii) hold. Let $C_1$ and $C_2$ be defined as in Corollary 3.1. We define the $m$-dimensional complex random vector $\widetilde Q_m = (\widetilde Q_{a,\lambda}(g,r_1),\ldots,\widetilde Q_{a,\lambda}(g,r_m))$, where $r_1,\ldots,r_m$ are such that $r_i\neq r_j$ (for $i\neq j$) and $r_i\neq0$. Under these conditions we have

$$\lambda^{d/2}\Sigma_m^{-1/2}\Big(\Re\widetilde Q_{a,\lambda}(g,0)-I\big(g;\tfrac a\lambda\big),\ \Re\widetilde Q_m,\ \Im\widetilde Q_m\Big)'\xrightarrow{D}N(0,I_{2m+1}), \quad \Sigma_m = \frac12\,\mathrm{diag}\big(C_1+\Re C_2,\ C_1I_{2m}\big), \qquad (23)$$

with $\log^2(a)\lambda^{-1/2}\to0$ as $\lambda\to\infty$, $n\to\infty$ and $a\to\infty$.

PROOF See Appendix D, Subba Rao (2015b). □

4 Statistical inference

In this section we apply the results derived in the previous section to some of the examples discussed in Section 2.2. We will assume the locations are uniformly distributed, the random field is Gaussian, and Assumption 2.5(b,c) is satisfied. All the statistics considered below are bias corrected and based on $\widetilde Q_{a,\lambda}(g;0)$. Since we make a bias correction, let $\widehat\sigma_n^2 = \frac1n\sum_{j=1}^{n}Z(s_j)^2$. In Appendix F, Subba Rao (2015b), we state the results for the non-bias corrected statistics.

4.1 The Whittle likelihood

We first consider the Whittle likelihood described in Example 2.1(i), but with the 'bias' term removed. Let $\widetilde\theta_n = \arg\min_{\theta\in\Theta}\widetilde L_n(\theta)$, where

$$\widetilde L_n(\theta) = \frac{1}{\lambda^d}\sum_{k=-C\lambda}^{C\lambda}\Big(\log f_\theta(\omega_k)+\frac{|J_n(\omega_k)|^2}{f_\theta(\omega_k)}\Big)-\frac{\widehat\sigma_n^2}{n}\sum_{k=-C\lambda}^{C\lambda}\frac{1}{f_\theta(\omega_k)} = \widetilde Q_{a,\lambda}(f_\theta^{-1};0)+\frac{1}{\lambda^d}\sum_{k=-C\lambda}^{C\lambda}\log f_\theta(\omega_k)$$

and $\Theta$ is a compact parameter space. Let $\theta_0 = \arg\min_{\theta\in\Theta}L(\theta)$, where

$$L(\theta) = \frac{1}{(2\pi)^d}\int_{2\pi[-C,C]^d}\Big(\log f_\theta(\omega)+\frac{f(\omega)}{f_\theta(\omega)}\Big)d\omega.$$

We will assume that $\theta_0$ is such that, for all $\omega\in\mathbb{R}^d$, $f(\omega)=f_{\theta_0}(\omega)$, where $f(\omega)$ denotes the true spectral density of the stationary spatial random process, and that there does not exist another $\theta\in\Theta$ such that $f_{\theta_0}(\omega)=f_\theta(\omega)$ for all $\omega\in2\pi[-C,C]^d$. Under the assumption that $\sup_{\theta\in\Theta}\sup_{\omega\in2\pi[-C,C]^d}|\nabla_\theta f_\theta(\omega)^{-1}|<\infty$ and $\sup_{\theta\in\Theta}\sup_{\omega\in2\pi[-C,C]^d}|\nabla_\theta^3f_\theta(\omega)^{-1}|<\infty$, and by using Theorem 3.1(i) and (22), we can show equicontinuity in probability of $\widetilde L_n(\theta)$ and $\nabla_\theta^3\widetilde L_n(\theta)$. These two conditions imply consistency, i.e. $\widetilde\theta_n\xrightarrow{P}\theta_0$ and $\nabla_\theta^2\widetilde L_n(\bar\theta_n)\xrightarrow{P}V$ for any $\bar\theta_n$ that lies between $\widetilde\theta_n$ and $\theta_0$, where

$$V = \frac{2}{(2\pi)^d}\int_{2\pi[-C,C]^d}[\nabla_\theta\log f_\theta(\omega)][\nabla_\theta\log f_\theta(\omega)]'\big|_{\theta=\theta_0}d\omega.$$
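Before deriving the limiting distribution, we note that the estimator itself is straightforward to compute. The following sketch (ours; $d=1$, exponential covariance family and optimizer choices are illustrative) minimizes the bias-corrected discretized Whittle likelihood above.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)

def whittle_estimate(Z, s, lam, C, f_theta, theta0):
    """Bias-corrected discretized Whittle estimator (Section 4.1), d = 1.
    f_theta(omega, theta) is the candidate spectral density family."""
    n = len(Z)
    wk = 2 * np.pi * np.arange(-int(C * lam), int(C * lam) + 1) / lam
    J = (lam ** 0.5 / n) * np.exp(1j * np.outer(wk, s)) @ Z
    I = np.abs(J) ** 2
    sig2 = np.mean(Z ** 2)

    def loss(theta):
        f = f_theta(wk, theta)
        if np.any(f <= 0):
            return np.inf
        return (np.sum(np.log(f) + I / f) / lam   # discretized Whittle (3)
                - (sig2 / n) * np.sum(1.0 / f))   # bias correction
    return minimize(loss, theta0, method="Nelder-Mead").x

# exponential covariance c(u) = exp(-rho*|u|): f(w) = 2*rho/(rho^2 + w^2)
f_exp = lambda w, th: 2 * th[0] / (th[0] ** 2 + w ** 2)
lam, n = 40.0, 800
s = rng.uniform(-lam / 2, lam / 2, n)
Z = rng.multivariate_normal(np.zeros(n),
                            np.exp(-np.abs(s[:, None] - s[None, :])))  # rho = 1
print(whittle_estimate(Z, s, lam, C=1.0, f_theta=f_exp, theta0=[0.5]))
```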
Using these preliminary results we now derive the limiting distribution of $\widetilde\theta_n$. By making a Taylor expansion of $\widetilde L_n$ about $\theta_0$, it is clear that the asymptotic sampling results are determined by the first derivative of $\widetilde L_n(\theta)$ at the true parameter,

$$\nabla_\theta\widetilde L_n(\theta_0) = \widetilde Q_{a,\lambda}(\nabla_\theta f_{\theta_0}(\cdot)^{-1};0)+\frac{1}{\lambda^d}\sum_{k=-a}^{a}\frac{\nabla_\theta f_{\theta_0}(\omega_k)}{f_{\theta_0}(\omega_k)},$$

where the second term on the right hand side of the above is deterministic and

$$\widetilde Q_{a,\lambda}(\nabla_\theta f_{\theta_0}(\cdot)^{-1};0) = \frac{1}{\lambda^d}\sum_{k=-C\lambda}^{C\lambda}\nabla_\theta f_{\theta_0}(\omega_k)^{-1}|J_n(\omega_k)|^2-\frac{\widehat\sigma_n^2}{n}\sum_{k=-a}^{a}\nabla_\theta f_{\theta_0}(\omega_k)^{-1}.$$

By applying Theorem 3.7 to $\widetilde Q_{a,\lambda}(\nabla_\theta f_{\theta_0}(\cdot)^{-1};0)$ we have $\lambda^{d/2}\nabla_\theta\widetilde L_n(\theta_0)\xrightarrow{D}N(0,V)$. Altogether this gives

$$\lambda^{d/2}(\widetilde\theta_n-\theta_0)\xrightarrow{D}N(0,V^{-1}),$$

with $\lambda^d/n\to0$ and $\log^2(a)/\lambda^{1/2}\to0$ as $a\to\infty$ and $\lambda\to\infty$. In the non-Gaussian case the limiting variance will change to $V^{-1}WV^{-1}$, where an expression for $W$ can be deduced from Theorem A.2. Note that the asymptotic sampling properties of the discretized Whittle likelihood estimator $\widetilde\theta_n$ are identical to the asymptotic sampling properties of the integrated Whittle likelihood estimator derived in Matsuda and Yajima (2009).

4.2 The nonparametric covariance estimator

We now consider the bias corrected version of the nonparametric estimator defined in Example 2.2(ii). Let

$$\widehat c_n(v) = T\Big(\frac{2v}{\lambda}\Big)\widetilde c_n(v), \quad\text{where}\quad \widetilde c_n(v) = \frac{1}{\lambda^d}\sum_{k=-a}^{a}|J_n(\omega_k)|^2\exp(iv'\omega_k)-\frac{\widehat\sigma_n^2}{n}\sum_{k=-a}^{a}\exp(iv'\omega_k),$$

and $T$ is the $d$-dimensional triangle kernel. It is clear that the asymptotic sampling properties of $\widehat c_n(v)$ are determined by $\widetilde c_n(v)$; therefore, we first consider $\widetilde c_n(v)$. We observe that $\widetilde c_n(v) = \widetilde Q_{a,\lambda}(e^{iv'\cdot};0)$, thus we use the results in Section 3 to derive the asymptotic sampling properties of $\widetilde c_n(v)$. By using Theorem 3.1(ii) and Assumption 2.5 we have

$$E[\widetilde c_n(v)] = I\Big(e^{iv\cdot};\frac a\lambda\Big)+O\Big(\frac{\log\lambda}{\lambda}\Big) = \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}f(\omega)\exp(iv'\omega)d\omega+O\Big(\frac{\log\lambda}{\lambda}+\frac\lambda a\Big) = c(v)+O\Big(\frac{\log\lambda}{\lambda}+\frac\lambda a\Big), \qquad (24)$$

where we use the condition $f(\omega)\le\beta_\delta(\omega)$ to replace $I(e^{iv\cdot};\frac a\lambda)$ with $I(e^{iv\cdot};\infty)$. Therefore, if $\lambda^{1+d/2}/a\to0$ and $\lambda^d/n\to0$ as $a\to\infty$ and $\lambda\to\infty$, then by using Theorem 3.7 we have

$$\lambda^{d/2}[\widetilde c_n(v)-c(v)]\xrightarrow{D}N(0,\sigma(v)^2) \qquad (25)$$

where

$$\sigma(v)^2 = \frac{1}{(2\pi)^d}\int_{2\pi[-a/\lambda,a/\lambda]^d}f(\omega)^2\big[1+\exp(i2v'\omega)\big]d\omega.$$

We now derive the sampling properties of $\widehat c_n(v)$. By using properties of the triangle kernel we have

$$E[\widehat c_n(v)] = c(v)+O\Big(\frac{\log\lambda}{\lambda}+\frac\lambda a+\frac{\max_{1\le i\le d}|v_i|}{\lambda}\Big)$$

and

$$\frac{\lambda^{d/2}\big[\widehat c_n(v)-T(\frac{2v}{\lambda})c(v)\big]}{T(\frac{2v}{\lambda})\sigma(v)}\xrightarrow{D}N(0,1),$$

with $\lambda^{1+d/2}/a\to0$, $\lambda^d/n\to0$ and $\log^2(a)/\lambda^{1/2}\to0$ as $a\to\infty$ and $\lambda\to\infty$. We mention that the Fourier transform of the bias corrected nonparametric covariance estimator is

$$\int_{\mathbb{R}^d}\widetilde c_n(v)e^{-iv'\omega}dv = \sum_{k=-a}^{a}\Big(|J_n(\omega_k)|^2-\frac{\lambda^d}{n}\widehat\sigma_n^2\Big)\mathrm{Sinc}^2\Big[\frac\lambda2(\omega_k-\omega)\Big].$$

Therefore, when a bias correction is made, the corresponding sample covariance may not be a non-negative definite function. The sampling properties of the non-negative definite nonparametric covariance function estimator defined in Example 2.2(ii) can be found in Appendix F, Subba Rao (2015b).

4.3 Parameter estimation using an L2 criterion

In this section we consider the asymptotic sampling properties of the parameter estimator using the $L_2$ criterion. We will assume that there exists a $\theta_0\in\Theta$ such that, for all $\omega\in\mathbb{R}^d$, $f_{\theta_0}(\omega)=f(\omega)$, and that there does not exist another $\theta\in\Theta$ such that $f_{\theta_0}(\omega)=f_\theta(\omega)$ for all $\omega\in\mathbb{R}^d$. In addition, we assume $\sup_{\theta\in\Theta}\int_{\mathbb{R}^d}\|\nabla_\theta f(\omega;\theta)\|_1^2d\omega<\infty$, $\sup_{\theta\in\Theta}\sup_{\omega\in\mathbb{R}^d}|\nabla_\theta f_\theta(\omega)|<\infty$ and $\sup_{\theta\in\Theta}\sup_{\omega\in\mathbb{R}^d}|\nabla_\theta^3f_\theta(\omega)|<\infty$. We consider the bias corrected version of the $L_2$ criterion given in Example 2.2(iii).
That is Ln (✓) = 1 d a X k1 ,...,kd = a |Jn (! k )|2 bn2 f✓ (! k ) 2 , and let ✓bn = arg min✓2⇥ Ln (✓) (⇥ is a compact set). By adapting similar methods to those developed in this paper to quartic forms we can show that R R P pointwise Ln (✓) ! L(✓), where L(✓) = Rd [f (!) f✓ (!)]d d! + Rd f (!)2 d!, which is minimised at P ✓0 . Under the stated conditions we can show uniform convergence of Ln (✓), which implies ✓bn ! ✓0 . Using this result the sampling properties of ✓bn are determined by r✓ Ln (✓) (given in equation (11)) e a, ( 2r✓ f✓ (·); 0), where and in particular its random component Q r✓ Ln (✓) = 2 d a X k1 ,...,kd = a <r✓ f✓ (! k ) |Jn (! k )| 2 f✓ (! k ) + 2bn2 d a X k1 ,...,kd = a a X e a, ( 2r✓ f✓ (·); 0) + 2 = Q f✓ (! k )r✓ f✓ (! k ). d <r✓ f✓ (! k ) k= a Using the assumptions stated above and Theorem 3.3 we can show that for any ✓¯n between ✓0 P and ✓bn we have r2✓ Ln (✓¯n ) ! A where Z 2 A= [r✓ f (!; ✓0 )][r✓ f (!; ✓0 )]0 d!, (26) (2⇡)d Rd as d /n ! 0 with ! 1 and n ! 1. Thus d/2 (✓bn ✓0 ) = A 1 d/2 rLn (✓0 ) + op (1), and it is e a, ( 2r✓ f✓ (·); 0) + clear the asymptotic sampling properties of ✓bn are determined r✓ Ln (✓0 ) = Q 0 1 Pa 2 r f (! ) . k d k= a ✓ ✓0 D Thus by using Theorem 3.7 we have d/2 r✓ Ln (✓0 ) ! N (0, B), where Z 4 B = f (!)2 [r✓ f✓ (!)] [r✓ f✓ (!)]0 c✓=✓0 d!. (2⇡)d Rd Therefore, by using the above we have d/2 with d n ! 0 and log2 (a) 1/2 (✓bn P ✓0 ) ! N (0, A ! 0 as a ! 1 and ! 1. 24 1 BA 1 ) 5 ea, (g; 0) An estimator of the variance of Q e a, (g; 0) given in the examples above, is rather unwieldy and The expression for the variance Q difficult to estimate directly. For example, in the case that the random field is Gaussian, one P can estimate C1 by replacing the integral with the sum ak= a and the spectral density function with the squared bias corrected periodogram (|Jn (! k )|2 bn2 )2 (see Bandyopadhyay et al. (2015), e a, ,↵ is Lemma 7.5, though their estimator also holds for non-Gaussian spatial processes when Q defined using the frequency grid described in Section 3.3.1). In this section we describe an alternative, very simple method for estimating the variance of e Qa, (g; 0) under the assumption the locations are uniformly distributed. For ease of presentation we will assume the spatial random field is Gaussian. However, exactly the same methodology holds when the spatial random fields is non-Gaussian. The methodology is motivated by the method of orthogonal samples for time series proposed in Subba Rao (2015a), where the idea is to define a sample which by construction shares some of the properties of the statistic of interest. In this e a, (g; r); r 6= 0} is an orthogonal sample associated with Q e a, (g; r). section we show that {Q e a, (g; 0). By using Theorem 3.1 we have E[Q e a, (g; 0)] = We first focus on estimating the variance of Q I g; a + o(1) and d e a, (g; 0)] = C1 + O(` var[Q ,a,n ). e a, (g; r)] = In contrast, we observe that if no elements of the vector r are zero, then by Theorem 3.1 E[Q Qd O( i=1 [log + log |ri |]/ d ) (slightly slower rates are obtained when r contains some zero). In other e a, (g; r) is estimating zero. On the other hand, by (22) we observe that words, Q ⇣ ⌘ ( krk1 1 h i C + O ` + r 1 = r 2 (= r) 1 ,a,n d 2 e a, (g; r 1 ), <Q e a, (g; r 2 ) = . cov <Q O (` ,a,n ) r 1 6= r 2 , r 1 6= r 2 e a, (g; r)}, furthermore we have d cov[<Q e a, (g; r 1 ), =Q e a, (g; r 2 )] = A similar result holds for {=Q O(` ,a,n ). 
e a, (g; r), =Q e a, (g; r)} are ‘near uncorrelated’ In summary, if krk1 is not too large, then {<Q p e a, (g; 0)/ 2. This suggests that random variables whose variance is approximately the same as Q e a, (g; r), =Q e a, (g; r); r 2 S} to estimate var[Q e a, (g; 0)], where the set S is defined we can use {<Q as S = {r; krk1 M, r 1 6= r 2 and all elements of r are non-zero} . This leads to the following estimator ⌘ d X⇣ d X e a, (g; r)|2 + 2|=Q e a, (g; r)|2 = e a, (g; r)|2 , Ve = 2|<Q |Q 2|S| |S| r2S (27) (28) r2S where |S| denotes the cardinality of the set S. Note that we specifically select the set S such that no e a, (g; r)] is small and does not bias the variance element r contains zero, this is to ensure that E[Q estimator. In the following theorem we obtain a mean squared bound for Ve . 25 Theorem 5.1 Let Ve be defined as in (28), where S is defined in (27). Suppose Assumptions 2.1, 2.2 and 2.5(a,b,c) hold and either Assumption 2.4(i) or (ii) holds. Then we have ⇣ ⌘2 d e a, (g; 0)] = O(|S| 1 + |M | 1 + ` ,a,n + d log4d (a)) E Ve var[Q as ! 1, a ! 1 and n ! 1 (where `a, ,n is defined in (35)). ⇤ PROOF. See Subba Rao (2015b), Section E. Thus it follows from the above result that if the set S grows at a rate such that |M | 1 ! 0 as e a, (g; 0)]. ! 1, then Ve is a mean square consistent estimators of d var[Q We note that the same estimator can be used in the case that the random field is non-Gaussian, in this case to show consistency we need to place some conditions on the higher order cumulants of the spatial random process. However, it is a trickier to relax the assumption that the locations e a, (g; r)] (r 6= 0) are uniformly distributed. This is because in the case of a non-uniform design E[Q will not, necessarily, be estimating zero. We note that the same method can be used to estimate the variance of the non-bias corrected statistic Qa, (g; 0). Acknowledgments This work has been partially supported by the National Science Foundation, DMS-1106518 and DMS-1513647. The author gratefully acknowledges Gregory Berkolaiko for many very useful discussions. Furthermore, the author gratefully acknowledges the editor, associate editor and three anonymous referees who made substantial improvements to every part of the manuscript. A A.1 Appendix Proof of Theorem 3.1 It is clear from the motivation at the start of Section 3.1 that the sinc function plays an important e a, (g; r). Therefore we now summarise some of its properties. role in the analysis of Qa, (g; r) and Q R /2 ! It is well known that 1 /2 exp(ix!)dx = sinc( 2 ) and Z 1 Z 1 sinc(u)du = ⇡ and sinc2 (u)du = ⇡. (29) 1 1 We next state a well know result that is an important component of the proofs in this paper. Lemma A.1 [Orthogonality of the sinc function] Z 1 sinc(u)sinc(u + x)du = ⇡sinc(x) (30) 1 and if s 2 Z/{0} then Z 1 sinc(u)sinc(u + s⇡)du = 0. 1 26 (31) ⇤ PROOF In Appendix C, Subba Rao (2015b). We use the above results in the proof below. PROOF of Theorem 3.1. We first prove (i). By using the same proof used to prove Theorem 2.1 (see Bandyopadhyay and Subba Rao (2015), Theorem 1) we can show that 3 ( 2 n d X 0 f (! k ) + O( 1 ) k1 = k2 (= k), E 4Jn (! k1 )Jn (! k2 ) Z(sj )2 eisj (!k1 !k2 ) 5 = , n O( d1 b ) k1 k2 6= 0. j=1 where b = b(k1 k2 ) denotes the number of zero elements in the vector r. Therefore, since a = O( ), e a, (g; r) we use the above to give by taking expectations of Q ( h i 1 Pa 1 d k= a g(! k )f (! k ) + O( ) r = 0 e a, (g; r) = E Q . 
O( d1 b ) r 6= 0 Therefore, by replacing the summand with the integral we obtain (12). The above method cannot be used to prove (ii) since a/ ! 1, this leads to bounds which may not converge. Therefore, as discussed in Section 3.1 we consider an alternative approach. To e a, (g; r) as a quadratic form to give do this we expand Q e a, (g; r) = Q n a 1 X X g(! k )Z(sj1 )Z(sj2 ) exp(i! 0k (sj1 n2 j ,j =1 1 2 j1 6=j2 sj2 )) exp( i! 0r sj2 ). k= a Taking expectations gives a h i X ⇥ e a, (g; r) = c2 E Q g(! k )E c(s1 s2 ) exp(i! 0k (s1 s2 ) k= a where c2 = n(n 1)/2. In the case that d = 1 the above reduces to a h i X e a, (g; r) = c2 E Q g(!k )E [c(s1 s2 ) exp(i!k (s1 s2 ) is2 !r )] = c2 k= a a X 2 k= a g(!k ) Z /2 /2 Z is02 ! r ) ⇤ /2 c(s1 s2 ) exp(i!k (s1 s2 ) is2 !r )ds1 ds2 . (32) /2 Replacing c(s1 s2 ) with the Fourier representation of the covariance function gives ✓ ◆ ✓ ◆ Z 1 a h i c2 X ! ! e E Qa, (g; r) = g(!k ) f (!)sinc + k⇡ sinc + (k + r)⇡ d!. 2⇡ 2 2 1 k= a By a change of variables y = !/2 + k⇡ and replacing the sum with an integral we have Z 1 a h i c2 X 2y e E Qa, (g; r) = g(!k ) f( !k )sinc(y)sinc(y + r⇡)dy ⇡ 1 k= a ✓ X ◆ Z a c2 1 1 2y = sinc(y)sinc(y + r⇡) g(!k )f ( !k ) dy ⇡ 1 k= a ✓ ◆ Z 1 Z 2⇡a/ c2 2y 1 = sinc(y)sinc(y + r⇡) g(u)f ( u)dudy + O , 2⇡ 2 1 2⇡a/ 27 where the O( f ( u) gives h 1) e a, (g; r) E Q i comes from Lemma E.1(ii) in Subba Rao (2015b). Replacing f ( 2y = c2 2⇡ 2 Z 1 sinc(y)sinc(y + r⇡) 1 Z 2⇡a/ g(u)f ( u)dudy + Rn + O( u) with 1 ), 2⇡a/ where Rn = c2 2⇡ 2 Z 1 sinc(y)sinc(y + r⇡) 1 ✓Z 2⇡a/ ✓ g(u) f ( 2⇡a/ 2y u) ◆ f ( u) dudy. By using Lemma C.2 in Subba Rao (2015b), we have |Rn | = O( log +I(r6=0) log |r| ). Therefore, by using Lemma A.1, replacing c2 with one (which leads to the error O(n 1 )) and (29) we have h i e a, (g; r) E Q ✓ ◆ Z 1 Z 2⇡a/ c2 log + I(r 6= 0) log |r| = sinc(y)sinc(y + r⇡) g(u)f ( u)dudy + O 2⇡ 2 1 2⇡a/ ✓ ◆ Z I(r = 0) 2⇡a/ 1 log + I(r 6= 0) log |r| = g(u)f (u)dudy + O + , 2⇡ 2⇡a/ which gives (13) in the case d = 1. To prove the result for d > 1, we will only consider the case d = 2, as it highlights the di↵erence from the d = 1 case. By substituting the spectral representation into (12) we have h i e a, (g; (r1 , r2 )) E Q ✓ ◆ Z a X c2 2u1 2u2 = g(!k1 , !k2 ) f ! k1 , !k2 sinc(u1 )sinc(u1 + r1 ⇡)sinc(u2 )sinc(u2 + r2 ⇡)du1 du2 ⇡2 2 R2 k1 ,k2 = a . In the case that either r1 = 0 or r2 = 0 we can use the same proof given in the case d = 1 to give Z 1 e a, (g; r)] = E[Q g(!1 , !2 )f (!1 , !2 ) (2⇡)2 ⇡ 2 2 [ a/ ,a/ ]2 Z sinc(u1 )sinc(u1 + r1 ⇡)sinc(u2 )sinc(u2 + r2 ⇡)du1 du2 + Rn , Rd where |Rn | = O( log +I(krk1 6=0) log krk1 + n 1 ), which gives the desired result. However, in the case that both r1 6= 0 and r2 6= 0, we can use Lemma C.2, equation (49), Subba Rao (2015b) to obtain ✓ ◆ Z a h i X c2 2u1 2u2 e E Qa, (g; (r1 , r2 )) = 2 2 g(!k1 , !k2 ) f ! k1 , ! k2 ⇥ ⇡ Rd k1 ,k2 = a ! Q2 [log + log |r |] i i=1 sinc(u1 )sinc(u1 + r1 ⇡)sinc(u2 )sinc(u2 + r2 ⇡)du1 du2 = O 2 thus we obtain the faster rate of convergence. It is straightforward to generalize these arguments to d > 2. 28 ⇤ A.2 Variance calculations in the case of uniformly sampled locations We start by proving the results in Section 3 for the case that the locations are uniformly distributed. The proof here forms the building blocks for the case of non-uniform sampling of the locations, which can found in Section B, Subba Rao (2015b). 
In Lemma E.2, Appendix E, Subba Rao (2015b) we show that if the spatial process is Gaussian and the locations are uniformly sampled then for a bounded frequency grid we have ( d h i C1 (! r ) + O( 1 + n ) r 1 = r 2 (= r) d e e cov Qa, (g; r 1 ), Qa, (g; r 2 ) = d O( 1 + n ) r 1 6= r 2 and d where h i e a, (g; r 1 ), Q e a, (g; r 2 ) = cov Q ( C2 (! r ) + O( 1 + d O( 1 + n ) d n ) r1 = r 1 6= r 2 (= r) r2 C1 (! r ) = C1,1 (! r ) + C1,2 (! r ) and C2 (! r ) = C2,1 (! r ) + C2,2 (! r ), (33) with C1,1 (! r ) = C1,2 (! r ) = C2,1 (! r ) = C2,2 (! r ) = Z 1 f (!)f (! + ! r )|g(!)|2 d!, (2⇡)d 2⇡[ a/ ,a/ ]d Z 1 f (!)f (! + ! r )g(!)g( ! ! r )d!, (2⇡)d Dr Z 1 f (!)f (! + ! r )g(!)g( !)d!, (2⇡)d 2⇡[ a/ ,a/ ]d Z 1 f (!)f (! + ! r )g(!)g(! + ! r )d!. (2⇡)d Dr (34) On the other other hand, if the frequency grid is unbounded, then under Assumptions 2.4(ii) and 2.5(b) we have d d where A1 (r 1 , r 2 ) = h i e a, (g; r 1 ), Q e a, (g; r 2 ) = A1 (r 1 , r 2 ) + A2 (r 1 , r 2 ) + O( cov Q h i e a, (g; r 1 ), Q e a, (g; r 2 ) = A3 (r 1 , r 2 ) + A4 (r 1 , r 2 ) + O( cov Q 1 ⇡ 2d d 2a X min(a,a+m) X m= 2a k=max( a, a+m) g(! k )g(! k ! m )Sinc(u Z R2d f( 2u ! k )f ( 2v m⇡)Sinc(v + (m + r 1 29 d n ), d n ), + ! k + ! r1 ) r 2 )⇡)Sinc(u)Sinc(v)dudv and expressions for A2 (r 1 , r 2 ), . . . , A4 (r 1 , r 4 ) can be found in Lemma E.2, Subba Rao (2015b). Interestingly supa |A1 (r 1 , r 2 )| < 1, supa |A2 (r 1 , r 2 )| < 1, supa |A3 (r 1 , r 2 )| < 1 and supa |A4 (r 1 , r 2 )| < 1. Hence if d /n ! c (c < 1) we have h i h i d d e a, (g; r 1 ), Q e a, (g; r 2 ) < 1 and e a, (g; r 1 ), Q e a, (g; r 2 ) < 1, sup cov Q sup cov Q a a We now obtain simplified expressions for the terms A1 (r1 , r2 ), . . . , A4 (r1 , r2 ) under the slightly stronger condition that Assumption 2.5(c) also holds. Lemma A.2 Suppose Assumptions 2.4(ii) and 2.5(b,c) hold. Then for 0 |r1 |, |r2 | C|a| (where C is some finite constant) we have ( C1,1 (!r ) + O(` ,a,n ) r1 = r2 (= r) A1 (r1 , r2 ) = O(` ,a,n ) r1 6= r2 A2 (r1 , r2 ) = ( C1,2 (!r ) + O(` O(` ,a,n ) A3 (r1 , r2 ) = ( C2,1 (!r ) + O(` O(` ,a,n ) A4 (r1 , r2 ) = ( C2,2 (!r ) + O(` O(` ,a,n ) ,a,n ) ,a,n ) r1 = r2 (= r) r1 6= r2 r1 = r2 (= r) r1 6= r2 and where C1,1 (!r ), . . . , C2,2 (!r ) (using d = 1) and ` ,a,n ,a,n ) r1 = r2 (= r) , r1 6= r2 are defined in Lemma E.2. PROOF. We first consider A1 (r1 , r2 ) and write it as A1 (r1 , r2 ) = Z 1Z 1 X 2a 1 sinc(u)sinc(v)sinc(u ⇡2 1 1 m⇡)sinc(v + (m + r1 m= 2a r2 )⇡)Hm, ✓ ◆ 2u 2v , ; r1 dudv, where Hm, ✓ 2u 2v , ; r1 ◆ = 1 min(a,a+m) X k=max( a, a+m) f ✓ 2u ◆ ✓ ◆ 2v + !k f + !k + !r g(!k )g(!k !m ) noting that f ( 2u !) = f (! 2u ). If f ( 2u + !k ) and f ( 2v + !k + !r ) are replaced with f (!k ) and f (!k + !r ) respectively, then we can exploit the orthogonality property of the sinc functions. This requires the following series of approximations. 30 (i) We start by defining a similar version of A1 (r1 , r2 ) but with the sum replaced with an integral. Let B1 (r1 r2 ; r1 ) Z 1Z 1 X 2a 1 = sinc(u)sinc(u ⇡2 1 1 m⇡)sinc(v)sinc(v + (m + r1 m= 2a r2 )⇡)Hm ✓ ◆ 2u 2v , ; r1 dudv, where Hm ✓ 2u 2v , ; r1 ◆ 1 = 2⇡ Z 2⇡ min(a,a+m)/ f (! 2u )f ( 2v 2⇡ max( a, a+m)/ + ! + !r1 )g(!)g(! !m )d!. By using Lemma C.4, Subba Rao (2015b) we have |A1 (r1 , r2 ) B1 (r1 r2 ; r1 )| = O ✓ log2 a ◆ . (ii) Define the quantity C1 (r1 r2 ; r1 ) Z 1Z 1 X 2a 1 = sinc(u)sinc(u ⇡2 1 1 m⇡)sinc(v)sinc(v + (m + r1 r2 )⇡)Hm (0, 0; r1 )dudv. 
m= 2a By using Lemma C.5, Subba Rao (2015b), we can replace B1 (r1 r2 ; r1 ) with C1 (r1 to give the replacement error ✓ ◆ (log a + log ) 2 |B1 (r1 r2 ; r1 ) C1 (r1 r2 ; r1 )| = O log (a) . (iii) Finally, we analyze C1 (r1 out of the integral to give r2 ; r1 ) r2 ; r1 ). Since Hm (0, 0; r1 ) does not depend on u or v we take it C1 (r1 r2 ; r1 ) ✓Z ◆ ✓Z 2a 1 X = Hm (0, 0; r1 ) sinc(u)sinc(u m⇡)du sinc(v)sinc(v + (m + r1 ⇡2 R R m= 2a ✓Z ◆ ✓Z ◆ 1 2 = H0 (0, 0; r1 ) sinc (u)du sinc(v)sinc(v + (r1 r2 )⇡)dv , ⇡2 R R r2 )⇡)dv where the last line of the above is due to orthogonality of the sinc function (see Lemma A.1). If r1 6= r2 , then by orthogonality of the sinc function we have C1 (r1 r2 ; r1 ) = 0. On the other hand if r1 = r2 we have Z 2⇡a/ 1 C1 (0; r) = f (!)f (! + !r )|g(!)|2 d! = C1,1 (!r ). 2⇡ 2⇡a/ The proof for the remaining terms A2 (r1 , r2 ), A3 (r1 , r2 ) and A4 (r1 , r2 ) is identical, thus we omit the details. ⇤ 31 ◆ Theorem A.1 Suppose Assumptions 2.1, 2.2, 2.4(ii) and 2.5(b,c) hold. Then for 0 |r1 |, |r2 | C|a| (where C is some finite constant) we have ( h i C1 (! r ) + O(` ,a,n ) r 1 = r 2 (= r) d e a, (g; r 1 ), Q e a, (g; r 2 ) = cov Q O(` ,a,n ) r 1 6= r 2 d h i e a, (g; r 1 ), Q e a, (g; r 2 ) = cov Q ( C2 (! r ) + O(` O(` ,a,n ) where C1 (! r ) and C2 (! r ) are defined in (79) and log a + log 2 ` ,a,n = log (a) ,a,n ) r1 = r 1 6= r 2 (= r) , r2 d + n . (35) PROOF. By using Lemmas E.2 and A.2 we immediately obtain (in the case d = 1) ( h i C1 (!r ) + O(` ,a,n ) r1 = r2 (= r) e a, (g; r1 ), Q e a, (g; r2 ) = cov Q O(` ,a,n ) r1 6= r2 and h e a, (g; r1 ), Q e cov Q ,a,n (g; r2 ) i = ( C2 (!r ) + O(` O(` ,a,n ) ,a,n ) r1 = r2 (= r) . r1 6= r2 This gives the result for d = 1. To prove the result for d > 1 we use the same procedure outlined in the proof of Lemma A.2 and the above. ⇤ The above theorem means that the variance for both the bounded and unbounded frequency grid are equivalent (up to the limits of an integral). Using the above results and the Lipschitz continuity of g(·) and f (·) we can show that ✓ ◆ krk1 Cj (! r ) = Cj + O , where C1 and C2 are defined in Corollary 3.1. We now derive an expression for the variance for the non-Gaussian case. Theorem A.2 Let us suppose that {Z(s); s 2 Rd } is a fourth order stationary spatial random field that satisfies Assumption 2.1(i). Suppose all the assumptions in Theorem 3.6 hold with the exception of Assumption 2.3 which is replaced with Assumption 2.2 (i.e. we assume the locations are uniformly sampled). Then for krk1 , kr 2 k1 << we have ( (a )d h i C + D + O(` + + krk1 ) r 1 = r 2 (= r) 1 1 ,a,n d n2 e a, (g; r 1 ), Q e a, (g; r 2 ) = cov Q (36) d O(` ,a,n + (an2) ) r 1 6= r 2 d h i e a, (g; r 1 ), Q e a, (g; r 2 ) = cov Q ( C2 + D2 + O(` O(` 32 ,a,n ,a,n + (a )d n2 (a )d ) n2 + + krk1 ) r1 = r 2 (= r) r 1 6= r2 , (37) where C1 and C2 are defined in Corollary 3.1 and Z 1 D1 = g(! 1 )g(! 2 )f4 ( ! 1 , ! 2 , ! 2 )d! 1 d! 2 (2⇡)2d 2⇡[ a/ ,a/ ]2d Z 1 D2 = g(! 1 )g(! 2 )f4 ( ! 1 , ! 2 , ! 2 )d! 1 d! 2 . (2⇡)2d 2⇡[ a/ ,a/ ]2d PROOF We prove the result for the notationally simple case d = 1. 
By using indecomposable partitions, conditional cumulants and (81) in Subba Rao (2015b) we have ✓ ◆ h i e a, (g; r1 ), Q e a, (g; r2 ) = A1 (r1 , r2 ) + A2 (r1 , r2 ) + B1 (r1 , r2 ) + B2 (r1 , r2 ) + O cov Q , (38) n where A1 (r1 , r2 ) and A2 (r1 , r2 ) are defined below equation (81) and a X B1 (r1 , r2 ) = c4 g(!k1 )g(!k2 )E 4 (s2 s1 , s3 s1 , s4 s1 )eis1 !k1 e is2 !k1 +r1 e is3 !k2 is4 !k2 +r2 e k1 ,k2 = a X B2 (r1 , r2 ) = a X n4 j1 ,...,j4 2D3 k1 ,k2 = E 4 (sj2 sj1 , sj3 a g(!k1 )g(!k2 ) ⇥ s j1 , sj 4 sj1 )eisj1 !k1 e isj2 !k1 +r1 e isj3 !k2 isj4 !k2 +r2 e with c4 = n(n 1)(n 2)(n 3)/n2 D3 = {j1 , . . . , j4 ; j1 6= j2 and j3 6= j4 but some j’s are in common}. The limits of A1 (r1 , r2 ) and A2 (r1 , r2 ) are given in Lemma A.2, therefore, all that remains is to derive bounds for B1 (r1 , r2 ) and B2 (r1 , r2 ). We will show that B1 (r1 , r2 ) is the dominating term, whereas by placing sufficient conditions on the rate of growth of a, we will show that B2 (r1 , r2 ) ! 0. In order to analyze B1 (r1 , r2 ) we will use the result ( Z ⇡ 3 r1 = r 2 sinc(u1 + u2 + u3 + (r2 r1 )⇡)sinc(u2 )sinc(u3 )du1 du2 du3 = , (39) 0 r1 6= r2 . R3 which follows from Lemma A.1. In the following steps we will make a series of approximations which will allow us to apply (39). We start by substituting the Fourier representation of the cumulant function Z 1 4 (s1 s2 , s1 s3 , s1 s4 ) = f4 (!1 , !2 , !3 )ei(s1 s2 )!1 ei(s1 s3 )!2 ei(s1 s4 )!4 d!1 d!2 d!3 , (2⇡)3 R3 into B1 (r1 , r2 ) to give B1 (r1 , r2 ) = c4 (2⇡)3 e 3 a X g(!k1 )g(!k2 ) k1 ,k2 = a is2 (!1 +!k1 +r1 ) R3 f4 (!1 , !2 , !3 ) is3 (!2 +!k2 ) is4 ( !3 +!k2 +r2 ) e Z Z eis1 (!1 +!2 +!3 +!k1 ) [ /2, /2]4 ds1 ds2 ds3 ds4 d!1 d!2 d!3 ✓ ◆ ✓ ◆ c4 (!1 + !2 + !3 ) !1 = g(!k1 )g(!k2 ) f4 (!1 , !2 , !3 )sinc + k1 ⇡ sinc + (k1 + r1 )⇡ (2⇡)3 2 2 R3 k1 ,k2 = a ✓ ◆ ✓ ◆ !2 !3 ⇥sinc + k2 ⇡ sinc (k2 + r2 )⇡ d!1 d!2 d!3 . 2 2 a X e Z 33 Now we make a change of variables and let u1 = !3 (k2 + r2 )⇡, this gives 2 c4 B1 (r1 , r2 ) = ⇡3 2 a X Z 3 k1 ,k2 = a R g(!k1 )g(!k2 )f4 ⇥sinc (u1 + u2 + u3 + (r2 ✓ !1 2 2u1 !2 2 + (k1 + r1 ), u2 = !k1 +r1 , 2u2 ! k2 , + k2 ⇡ and u3 = 2u3 + !k2 +r2 ◆ ⇥ r1 )⇡) sinc(u1 )sinc(u2 )sinc(u3 )du1 du2 du3 . Next we exchange the summand with a double integral and use Lemma E.1(iii) together with Lemma C.1, equation (44) (in Subba Rao (2015a)) to obtain ✓ ◆ Z Z c4 2u1 2u2 2u3 B1 (r1 , r2 ) = g(!1 )g(!2 )f4 ! 1 ! r1 , !2 , + ! 2 + ! r2 ⇥ (2⇡)2 ⇡ 3 2⇡[ a/ ,a/ ]2 R3 ✓ ◆ 1 sinc (u1 + u2 + u3 + (r2 r1 )⇡) sinc(u1 )sinc(u2 )sinc(u3 )du1 du2 du3 d!1 d!2 + O . By using Lemma C.3, we replace f4 ( 2u1 f4 ( !1 !r1 , !2 , !2 + !r2 ), this gives !r1 , 2u2 !1 !2 , 2u3 + !2 + !r2 ) in the integral with B1 (r1 , r2 ) Z c4 g(!1 )g(!2 )f4 ( !1 !r1 , !2 , !2 + !r2 ) ⇥ = (2⇡)2 ⇡ 3 2⇡[ a/ ,a/ ]2 ✓ 3 ◆ Z log sinc(u1 + u2 + u3 + (r2 r1 )⇡)sinc(u1 )sinc(u2 )sinc(u3 )du1 du2 du3 d!1 d!2 + O R3 ✓ 3 ◆ Z c4 Ir1 =r2 log = g(!1 )g(!2 )f4 ( !1 !r1 , !2 , !2 + !r2 )d!1 d!2 + O , 2 (2⇡) 2⇡[ a/ ,a/ ]2 where the last line follows from the orthogonality relation of the sinc function in equation (39). Finally we make one further approximation. In the case r1 = r2 we replace f4 ( !1 !r1 , !2 , !2 + !r2 ) with f4 ( !1 , !2 , !2 ) which by using the Lipschitz continuity of f4 and Lemma C.1, equation (44) gives ✓ 3 ◆ Z c4 log |r1 | B1 (r1 , r1 ) = g(!1 )g(!2 )f4 ( !1 , !2 , !2 )d!1 d!2 + O + . (2⇡)2 2⇡[ a/ ,a/ ]2 Altogether this gives = B1 (r1 , r2 ) ( R 1 (2⇡)2 2⇡[ a/ ,a/ ]2 g(!1 )g(!2 )f4 ( !1 , !2 , !2 )d!1 d!2 + O( log O( log 3 ) 3 + |r1 |+|r2 | ) r 1 = r2 r1 6= r2 . . 
Next we show that B2 (r1 , r2 ) is asymptotically negligible. To bound B2 (r1 , r2 ) we decompose D3 into six sets, the set D3,1 = {j1 , . . . , j4 ; j1 = j3 , j2 and j4 are di↵erent)}, D3,2 = {j1 , . . . , j4 ; j1 = j4 , j2 and j3 are di↵erent), D3,3 = {j1 , . . . , j4 ; j2 = j3 , j1 and j4 are di↵erent), D3,4 = {j1 , . . . , j4 ; j2 = j4 , j1 and j3 are di↵erent), D2,1 = {j1 , . . . , j4 ; j1 = j3 and j2 = j4 }, D2,2 = {j1 , . . . , j4 ; j1 = j4 and 34 j2 = j3 }. Using this decomposition we have B2 (r1 , r2 ) = where B2,(3,1) (r1 , r2 ) = P4 j=1 B2,(3,j) (r1 , r2 )+ a X |D3,1 | g(!k1 )g(!k2 ) ⇥ n4 k1 ,k2 = a E 4 (sj2 sj1 , 0, sj4 sj1 )eisj1 !k1 e isj2 !k1 +r1 e P2 j=1 B2,(2,j) (r1 , r2 ), isj1 !k2 isj4 !k2 +r2 e for j = 2, 3, 4, B2,(3,j) (r1 , r2 ) are defined similarly, B2,(2,1) (r1 , r2 ) = a X |D2,1 | g(!k1 )g(!k2 ) ⇥ n4 k1 ,k2 = a E 4 (sj2 sj1 , 0, sj2 sj1 )eisj1 !k1 e isj2 !k1 +r1 e isj1 !k2 isj2 !k2 +r2 e , B2,(2,2) (r1 , r2 ) is defined similarly and | · | denotes the cardinality of a set. By using identical methods to those used to bound B1 (r1 , r2 ) we have |B2,(3,1) (r1 , r2 )| a X C ✓ Z (!1 + !3 ) |g(!k1 )g(!k2 )| |f4 (!1 , !2 , !3 )| sinc + (k2 k1 )⇡ 3 n(2⇡) 2 3 R k1 ,k2 = a ✓ ◆ ✓ ◆ ✓ ◆ !1 !3 sinc (k1 + r1 )⇡ ⇥ sinc + (k2 + r2 )⇡ d!1 d!2 d!3 = O . 2 2 n ◆ ⇥ Similarly we can show |B2,(3,j) (r1 , r2 )| = O( n ) (for 2 j 4) and |B2,(2,1) (r1 , r2 )| a X ✓ Z (!1 + !2 ) |g(!k1 )g(!k2 )| |f4 (!1 , !2 , !3 )| sinc + (k2 (2⇡)3 n2 2 3 R k1 ,k2 = a ✓ ◆ ✓ ◆ (!1 + !2 ) a sinc + (k2 k1 + r2 r1 )⇡ d!1 d!2 d!3 = O . 2 n2 ◆ k1 )⇡ ⇥ This immediately gives us (36). To prove (37) we use identical methods. Thus we obtain the result. ⇤ Note that the term B2,(2,1) (r1 , r2 ) in the proof above is important as it does not seem possible to improve on the bound O(a /n2 ). References Bandyopadhyay, B., & Subba Rao, S. (2015). A test for stationarity for irregularly spaced spatial data. To appear in the Journal of the Royal Statistical Society (B). 35 Bandyopadhyay, S., & Lahiri, S. (2009). Asymptotic properties of discrete fourier transforms for spatial data. Sankhya, Series A, 71, 221-259. Bandyopadhyay, S., Lahiri, S., & Norman, D. (2015). A frequency domain empirical likelihood method for irregular spaced spatial data. Annals of Statistics, 43, 519-545. Beutler, F. J. (1970). Alias-free randomly timed sampling of stochastic processes. IEEE Transactions on Information Theory, 16, 147-152. Bickel, J. P., & Ritov, Y. (1988). Estimating integrated squared density derivatives: Sharp best order of convergence. Sankhya Ser A, 50, 381-393. Brillinger, D. (1969). The calculation of cumulants via conditioning. Annals of the Institute of Statistical Mathematics, 21, 215-218. Brillinger, D. (1981). Time series, data analysis and theory. San Francisco: SIAM. Can, S., Mikosch, T., & Samorodnitsky, G. (2010). Weak convergence of the function-indexed integrated periodogram for infinite variance processes. Bernoulli, 16, 995-1015. Chen, W. W., Hurvich, C., & Lu, Y. (2006). On the correlation matrix of the discrete Fourier Transform and the fast solution of large toeplitz systems for long memory time series. Journal of the American Statistical Association, 101, 812-821. Cressie, N., & Huang, H.-C. (1999). Classes of nonseperable spatiotemporal stationary covariance functions. Journal of the American Statistical Association, 94, 1330-1340. Dahlhaus, R. (1989). Efficient parameter estimation for self-similar processes. Annals of Statistics, 17, 749-1766. Dahlhaus, R., & Janas, D. (1996). 
Frequency domain bootstrap for ratio statistics in time series analysis. Annals of Statistics, 24, 1934-1963. Deo, R. S., & Chen, W. W. (2000). On the integral of the squared periodogram. Stochastic processes and their applications, 85, 159-176. Dunsmuir, W. (1979). A central limit theorem for parameter estimation in stationary vector time series and its application to models for a signal observed with noise. Annals of Statistics, 7, 490-506. Fox, R., & Taqqu, M. (1987). Central limit theorems for quadratic forms in random variables having long-range dependence. Probability Theory and Related Fields, 74, 213-240. Fuentes, M. (2007). Approximate likelihood for large irregular spaced spatial data. Journal of the American Statistical Association, 136, 447-466. 36 Gine, G., & Nickl, R. (2008). A simple adaptive estimator of the integrated square of a density. Bernoulli, 14, 47-61. Giraitis, L., & Surgailis, D. (1990). A central limit theorem for quadratic forms in strongly dependent linear variables and its application to asymptotical normality of whittle’s estimate. Probability Theory and Related Fields, 86, 87-104. Hall, P., Fisher, N. I., & Ho↵man, B. (1994). On the nonparametric estimation of covariance functions. Annals of Statistics, 22, 2115-2134. Hall, P., & Patil, P. (1994). Properties of nonparametric estimators of autocovariance for stationary random fields. Probability Theory and Related Fields, 99, 399-424. Hannan, E. J. (1971). Non-linear time series regression. Advances in Applied Probability, 8, 767-780. Kawata, T. (1959). Some convergence theorems for stationary stochastic processes. The Annals of Mathematical Statistics, 30, 1192-1214. Lahiri, S. (2003). Central limit theorems for weighted sums of a spatial process under a class of stochastic and fixed designs. Sankhya, 65, 356-388. Laurent, B. (1996). Efficient estimation of integral functions of a density. Annals of Statistics, 24, 659-681. Masry, E. (1978). Poisson sampling and spectral estimation of continuous time processes. IEEE Transactions on Information Theory, 24, 173-183. Matsuda, Y., & Yajima, Y. (2009). Fourier analysis of irregularly spaced data on Rd . Journal of the Royal Statistical Society (B), 71, 191-217. Niebuhr, T., & Kreiss, J.-P. (2014). Asymptotics for autocovariances and integrated periodograms for linear processes observed at lower frequencies. International Statistical Review, 82, 123140. Rice, J. (1979). On the estimation of the parameters of a power spectum. Journal of Multivariate Analysis, 9, 378-392. Shapiro, H., & Silverman, R. A. (1960). Alias free sampling of random noise. J SIAM, 8, 225-248. Stein, M., Chi, Z., & Welty, L. J. (2004). Approximating likelihoods for large spatial data sets. Journal of the Royal Statistical Society (B), 66, 1-11. Stein, M. L. (1994). Fixed domain asymptotics for spatial periodograms. Journal of the American Statistical Association, 90, 1277-1288. Stein, M. L. (1999). Interpolation of spatial data: Some theory for kriging. New York: Springer. 37 Subba Rao, S. (2015a). Proxy samples: A simple method for estimating nuisance parameters in time series. Preprint. Subba Rao, S. (2015b). Supplementary material: Fourier based statistics for irregular spaced spatial data. Taqqu, M., & Peccati, G. (2011). Wiener chaos: moments, cumulants and diagrams. a survey with computer implementation. Milan: Springer. Vecchia, A. V. (1988). Estimation and model identification for continuous spatial processes. Journal of the Royal Statistical Society (B), 50, 297-312. Walker, A. M. 
(1964). Asymptotic properties of least suqares estimates of paramters of the spectrum of stationary non-deterministic time series. Journal of the Australian Mathematical Society, 4, 363-384. 38 Supplementary material B Proofs in the case of non-uniform sampling In this section we prove the results in Section 3. Most of the results derived here are based on the methodology developed in the uniform sampling case. PROOF of Theorem 2.2 We prove the result for d = 1. It is straightforward to show 1 X c2 cov [Jn (!k1 ), Jn (!k2 )] = j1 j2 j1 ,j2 = 1 + c(0) k1 k2 n Z Z /2 /2 /2 s2 )eis1 !j1 eis2 !j2 eis1 !k1 c(s1 is2 !k2 ds1 ds2 /2 . Thus, by using the same arguments as those used in the proof of Theorem 2.1, Bandyopadhyay and Subba Rao (2015), we can show that ! Z ! Z /2 1 /2 c2 X cov [Jn (!k1 ), Jn (!k2 )] = e is(!k2 !j2 !j1 !k1 ) ds c(t)eit(!j1 +!k1 ) dt j1 j2 /2 j1 ,j2 = 1 + c(0) k1 k2 n ✓ ◆ 1 +O where to obtain the remainder O( 1 ) we use that Z /2 e is(!k2 !j2 !j1 !k1 ) /2 , P1 j= 1 | j | ds = /2 ( 0 k2 k2 < 1. Next, by using the identity j2 = 6 k1 + j 1 j 2 = k1 + j 1 we have cov [Jn (!k1 ), Jn (!k2 )] = c2 = 1 X j k2 k1 j j= 1 1 X Z /2 c(t)eit(!j1 +!k1 ) dtds + /2 j k2 k1 j f (!k1 +j ) + c(0) k1 k2 n j= 1 c(0) k1 k2 n ✓ ◆ 1 1 +O + . n + O( 1 ) Finally, we replace f (!k1 +j ) with f (!k1 ) and use the Lipschitz continiuity of f (·) to give 1 X j= 1 j k2 k1 j [f (!k1 ) f (!k1 +j )] where the last line follows from | j | C|j| cov [Jn (!k1 ), Jn (!k2 )] = f (!k1 ) 1 C X (1+ ) I(j 1 X j= 1 j k2 k1 + c(0) k1 k2 n This completes the proof for d = 1, the proof for d > 1 is the same. 39 j| = O ✓ ◆ 1 , 6= 0). Altogether, this gives j k2 k1 j j= 1 |j| · | +O ✓ ◆ 1 . ⇤ PROOF of Theorem 3.2 We prove the result for the case d = 1. Using the same method used to prove Theorem 3.1 (see the arguments in Section 3.1) we obtain h i e a, (g; r) E Q ✓ ◆ ✓ ◆ Z 1 1 a X X c2 ! ! = + (k + j1 )⇡ sinc + (k + r j2 )⇡ d!. g(!k ) f (!)sinc j1 j2 2⇡ 2 2 1 j1 ,j2 = 1 k= a ! 2 By the change of variables y = h i e a, (g; r) E Q c2 ⇡ = 1 X j1 j2 j1 ,j2 = 1 a X + (k + j1 )⇡ we obtain g(!k ) k= a Z 1 1 f ✓ ◆ 2y !k sinc(y)sinc(y + (r Replacing sum with an integral and using Lemma E.1(ii) gives h i e E Qa, (g; r) ◆ ✓ Z 2⇡a/ Z 1 1 X c2 2y = g(!)f ! d! sinc(y)sinc(y + (r j1 j2 2⇡ 2 2⇡a/ 1 j1 j1 j2 )⇡)dy. j2 )⇡)dy + O j1 ,j2 = 1 Next, replacing f ( 2y h i e a, (g; r) E Q = c2 2⇡ 2 1 X ✓ ◆ 1 . !) with f ( !) and we have j1 j2 Z j1 j2 Z j1 ,j2 = 1 2⇡a/ g(!)f 2⇡a/ ✓ 2y ◆ ! d! Z 1 sinc(y)sinc(y + (r j1 j2 )⇡)dy + Rn + O 1 ✓ ◆ 1 . where Rn = c2 2⇡ 2 1 X j1 ,j2 = 1 2⇡a/ 2⇡a/ ✓ 2y g(!) f ! ◆ f ( !) Z 1 sinc(y)sinc(y + (r j1 j2 )⇡)dyd!. 1 By using Lemma C.2, Subba Rao (2015b) we have |Rn | C 1 X j1 ,j2 = 1 | j1 | ·| j2 | log + log |r j1 j2 | C 1 X j1 ,j2 = 1 | j1 | ·| j2 | log + log |r| + log |j1 | + log |j2 | noting that C is a generic constant that changes between inequalities. This gives h i e a, (g; r) E Q ✓ ◆ Z 2⇡a/ Z 1 1 X c2 log + log |r| = g(!)f ( !)d! sinc(y)sinc(y + (r j1 j2 )⇡)dy + O . j1 j2 2⇡ 2 2⇡a/ 1 j1 ,j2 = 1 Finally, by the orthogonality of the sinc function at integer shifts (and f ( !) = f (!)) we have ✓ ◆ Z 2⇡a/ 1 h i X log + log |r| 1 e a, (g; r) = 1 E Q g(!)f (!)d! + O + , j r j 2⇡ n 2⇡a/ j= 1 40 , ⇤ thus we obtain the desired result. PROOF of Theorem 3.3(i) To prove (i) we use Theorem 2.2 and Lemma D.1 which immediately gives the result. 
PROOF of Theorem 3.3(ii) We first note that by using Lemma D.1 (generalized to nonuniform sampling), we can show that ✓ ◆ h i e e cov Qa, (g; r1 ), Qa, (g; r2 ) = A1 (r1 , r2 ) + A2 (r1 , r2 ) + O (40) n where a X A1 (r1 , r2 ) = k1 ,k2 = a ⇥ ⇤ g(!k1 )g(!k2 )cov Z(s1 ) exp(is1 !k1 ), Z(s3 ) exp(is3 !k2 ) ⇥ ⇥ ⇤ cov Z(s2 ) exp( is2 !k1 +r1 ), Z(s4 ) exp( is4 !k2 +r2 ) a X ⇥ ⇤ A2 (r1 , r2 ) = g(!k1 )g(!k2 )cov Z(s1 ) exp(is1 !k1 ), Z(s4 ) exp( is4 !k2 +r2 ) ⇥ k1 ,k2 = a ⇥ ⇤ cov Z(s2 ) exp( is2 !k1 +r1 ), Z(s3 ) exp(is3 !k2 ) . We first analyze A1 (r1 , r2 ). Conditioning on the locations s1 , . . . , s4 gives = A1 (r1 , r2 ) 1 X j1 j2 j3 j4 j1 ,...,j4 = 1 eis1 !k1 a X is3 !k2 g(!k1 )g(!k2 ) Z 1 3 k1 ,k2 = a e [ /2, /2]4 c(s1 is2 !k1 +r1 +is4 !k2 +r2 i(s1 !j1 +s2 !j2 +s3 !j3 +s4 !j4 ) e s3 )c(s2 s4 ) ds1 ds2 ds3 ds4 . By using the spectral representation theorem and integrating out s1 , . . . , s4 we can write the above as A1 (r1 , r2 ) ◆ x = g(!k1 )g(!k2 ) f (x)f (y)sinc + (k1 + j1 )⇡ j1 j2 j3 j4 (2⇡)2 2 1 1 j1 ,...,j4 = 1 k1 ,k2 = a ✓ ◆ ✓ ◆ ✓ ◆ y x y sinc (r1 + k1 j2 )⇡ sinc + (k2 j3 )⇡ sinc (r2 + k2 + j4 )⇡ dxdy 2 2 2 Z 1Z 1 1 a X X 1 2u 2v = g(!k1 )g(!k2 ) f( !k1 +j1 )f ( + !k1 +r1 j2 ) j1 j2 j3 j4 2 ⇡ 1 1 1 X j1 ,...,j4 = 1 k1 j1 j3 )⇡)sinc(v)sinc(v By making a change of variables m = k1 ⇡2 j1 ,...,j4 = 1 1 Z ✓ 1 k1 ,k2 = a ⇥sinc(u)sinc(u + (k2 |A1 (r1 , r2 )| 1 X 1 Z a X j1 j2 j3 j4 1 X (r2 + k2 + j4 r1 k1 + j2 )⇡)dudv. (41) k2 we have a X g(!k1 )g(!m m= 1 k1 = a ⇥sinc(u)sinc(u + (m + j1 + j3 )⇡)sinc(v)sinc(v + (m 41 k1 ) Z r2 1 1 Z 1 1 |f ( j 4 + r1 2u !k1 +j1 )f ( j2 )⇡)dudv. 2v + !k1 +r1 j2 )| Now by following the same series of bounds used to prove Lemma E.2(iii) we have |A1 (r1 , r2 )| 1 ⇡2 ⇥ 1 X j1 ,...,j4 = 1 Z 1 X 2d m= 1 R < 1. | j1 j2 j3 j4 | sup |g(!)| ! |sinc(u 2 kf k22 (m + j1 + j3 )⇡sinc(v + (m r2 + r1 j4 j2 )⇡)sinc(u)sinc(v)|dudv h i e a, (g; r1 ), Q e a, (g; r2 ) , thus giving the required result. Similarly we can bound A2 (r1 , r2 ) and cov Q ⇤ PROOF of Theorem 3.3(iii) The proof uses the expansion (40). Using this as a basis, we will show that A1 (r1 , r1 ) = U1 (r1 , r2 ; !r1 , !r2 ) + O(` ,a,n ) A2 (r1 , r1 ) = U2 (r1 , r2 ; !r1 , !r2 ) + O(` ,a,n ). We find an approximation for A1 (r1 , r2 ) starting with the expansion given in (41). We use the same P proof as that used to prove Lemma A.2 to approximate the terms inside the sum j1 ,...,j4 . More precisely we let m = k1 k2 , replace !k1 with ! and by using the same methodology given in the P proof of Lemma A.2 (and that j | j | < 1), we have = A1 (r1 , r2 ) 1 X 1 2⇡ 3 Z ⇥ j1 j2 j3 j4 j1 ,...,j4 = 1 R2 Z 2a X 2⇡ min(a,a+m)/ f( ! !j1 )f (! + !r1 m= 2a 2⇡ max( a, a+m)/ sinc(u)sinc(u + (m + j1 + j3 )⇡)sinc(v)sinc(v (m r2 j 4 + r1 j2 )g(!)g(! j2 )⇡)dudv + O(` By orthogonality of the sinc function we see that the above is zero unless m = m = r2 r1 + j2 + j4 (and using that f ( ! !j1 ) = f (! + !j1 )), therefore = A1 (r1 , r2 ) X 1 2⇡ j1 j2 j3 j4 j1 +...+j4 =r1 r2 Z !m )d! j1 ,a,n ). j3 and 2⇡ min(a,a j1 j3 )/ 2⇡ max( a, a j1 j3 )/ g(!)g(! + !j1 +j3 )f (! + !j1 )f (! + !r1 j2 )d! + O(` This gives us U1,1 (r1 , r2 ; !r1 , !r2 ). 
Next we consider A2 (r1 , r2 ) A2 (r1 , r2 ) = 1 3 X j1 ,j2 ,j3 ,j4 2Z j1 j2 j3 j4 a X g(!k1 )g(!k2 ) k1 ,k2 = a Z [ /2, /2]4 c(s1 s4 )c(s2 exp(is4 !k2 +r2 +j4 ) exp( is2 !k1 +r1 j2 ) exp( is3 !k2 j3 )ds1 ds2 ds3 ds4 ✓ ◆ Z a X X x = g(!k1 )g(!k2 ) f (x)f (y)sinc + (k1 + j1 )⇡ ⇥ 2 R2 j1 ,j2 ,j3 ,j4 2Z k1 ,k2 = a ✓ ◆ ✓ ◆ ✓ x y y sinc (k2 + r2 + j4 )⇡ sinc (k1 + r1 j2 )⇡ sinc + (k2 2 2 2 42 s3 ) exp(is1 !k1 +j1 ) ◆ j3 )⇡ dxdy. ,a,n ). x 2 Making a change of variables u = A2 (r1 .r2 ) = 1 ⇡2 X j1 ,j2 ,j3 ,j4 2Z sinc(u)sinc (u + (k1 + j1 )⇡ and v = a X j1 j2 j3 j4 y 2 g(!k1 )g(!k2 ) k1 ,k2 = a (k1 + r1 Z R2 f ✓ j2 )⇡ we have ◆ ✓ 2v !k1 +j1 f + !k1 +r1 2u (k2 + r2 + j4 + k1 + j1 )⇡) sinc(v)sinc (v + (k2 j 3 + k 1 + r1 j2 ◆ ⇥ j2 )⇡) dudv. Again by using the same proof as that given in Lemma A.2 to approximate the terms inside the P sum j1 ,...,j4 (setting m = k1 + k2 and replacing !k1 with !), we can approximate A2 (r1 , r2 ) with = A2 (r1 , r2 ) X 1 2⇡ 3 j1 j2 j3 j4 j1 ,j2 ,j3 ,j4 2Z sinc(u)sinc (u Z 2a X 2⇡ min(a,a+m)/ g(!)g( ! + !m ) m= 2a 2⇡(max( a, a+m)/ (m + r2 + j4 + j1 )⇡) sinc(v)sinc (v + (m j 3 + r1 Z R2 f( ! !j1 )f (! + !r1 j2 )⇡) dudvd! + O(` Using the orthogonality of the sinc function, the inner integral is non-zero when m j3 + r1 and m + r2 + j4 + j1 = 0. Setting m = r1 + j2 + j3 , this implies A2 (r1 , r2 ) = 1 2⇡ X j1 j2 j3 j4 j1 +j2 +j3 +j4 =r1 r2 ⇥f (! + !j1 )f (! + !r1 j2 )d! Z 2⇡ min(a,a r1 +j2 +j3 )/ g(!)g( ! ! r1 2⇡ max( a, a r1 +j2 +j3 )/ + O(` j3 j2 ) ,a,n ), thus giving us U1,2 (r 1 , r 2 ; ! r1 , ! r2 ). By using Lemma D.1, we can show that h where e a, (g; r1 ), Q e a, (g; r2 ) cov Q A3 (r1 , r2 ) = a X k1 ,k2 = a i = A3 (r1 , r2 ) + A4 (r1 , r2 ) + O ✓ ◆ n ⇥ ⇤ g(!k1 )g(!k2 )cov Z(s1 ) exp(is1 !k1 ), Z(s3 ) exp( is3 !k2 ) ⇥ ⇥ ⇤ cov Z(s2 ) exp( is2 !k1 +r1 ), Z(s4 ) exp(is4 !k2 +r2 ) a X ⇥ ⇤ A4 (r1 , r2 ) = g(!k1 )g(!k2 )cov Z(s1 ) exp(is1 !k1 ), Z(s4 ) exp(is4 !k2 +r2 ) ⇥ k1 ,k2 = a ⇥ ⇤ cov Z(s2 ) exp( is2 !k1 +r1 ), Z(s3 ) exp( is3 !k2 ) . 43 ,a,n ). j2 = 0 j2 ) ⇥ By following the same proof as that used to prove A1 (r1 , r2 ) we have A3 (r1 , r2 ) 1 X = Z a X 1 Z ✓ 1 x + (k1 + j1 )⇡ 2 ◆ j4 )⇡ dxdy g(!k1 )g(!k2 ) f (x)f (y)sinc j1 j2 j3 j4 (2⇡)2 1 1 j1 ,...,j4 = 1 k1 ,k2 = a ✓ ◆ ✓ ◆ ✓ x y y sinc (k2 + j3 )⇡ sinc (r1 + k1 j2 )⇡ sinc + (r2 + k2 2 2 2 Z 1Z 1 1 a X X 1 2u 2v = g(!k1 )g(!k2 ) f( !k1 +j1 )f ( + !k1 +r1 j1 j2 j3 j4 2 ⇡ 1 1 j1 ,...,j4 = 1 ⇥sinc(u)sinc(u = 1 X 1 ⇡2 j2 ) k1 ,k2 = a (k2 + k1 + j1 + j3 )⇡)sinc(v)sinc(v + (r2 + k2 min(a,a+m) 2a X j1 j2 j3 j4 j1 ,...,j4 = 1 ⇥sinc(u)sinc(u ◆ X g(!k1 )g(!m m= 2a k1 =max( a, a+m) (m + j1 + j3 )⇡)sinc(v)sinc(v + (r2 + m j4 + r1 + k1 j2 )⇡)dudv Z 1Z 1 2u 2v f( !k1 +j1 )f ( + !k1 +r1 k1 ) 1 j 4 + r1 1 j2 )⇡)dudv. Again using the method used to bound A1 (r1 , r2 ) gives A3 (r1 , r2 ) = 1 2⇡ 3 X j1 j2 j3 j4 m j1 +j2 +j3 +j4 =r1 +r2 f (! + !j1 )f (! + !r1 XZ j2 )d! + O(` 2⇡ min(a,a j1 j3 )/ g(!)g(! 2⇡ max( a, a j1 j3 )/ ,a,n ) j1 j3 = U1,2 (r 1 , r 2 ; ! r1 , ! r2 ) + O(` !) ⇥ ,a,n ). Finally we consider A4 (r1 , r2 ). 
Using the same expansion as the above we have = A4 (r1 , r2 ) X 1 3 j1 j2 j3 j4 j1 ,j2 ,j3 ,j4 2Z a X g(!k1 )g(!k2 ) k1 ,k2 = a exp( is4 !k2 +r2 j4 ) exp( is2 !k1 +r1 Z a X X = g(!k1 )g(!k2 ) j1 ,j2 ,j3 ,j4 2Z k1 ,k2 = a sinc ✓ ◆ x + (k2 + r2 2 X j4 )⇡ sinc = 1 ⇡2 = sinc(u)sinc (u + (k2 + r2 X 1 = j1 j2 j3 j4 j1 ,j2 ,j3 ,j4 2Z 2⇡ 3 sinc(u)sinc (u + (r2 X 1 2⇡ f (x)f (y)sinc y 2 j4 s4 )c(s2 s3 ) exp(is1 !k1 +j1 ) g(!k1 )g(!k2 ) Z ◆ x + (k1 + j1 )⇡ ⇥ 2 ◆ ✓ ◆ y j2 )⇡ sinc (k2 + j3 )⇡ dxdy 2 R2 f( 2u !k1 +j1 )f ( 2v + !k1 +r1 j2 ) ⇥ j4 k1 j1 )⇡) sinc(v)sinc (v (k2 + j3 k1 r1 + j2 )⇡) dudv Z 2⇡ min(a,a+m) Z 2a X g(!)g(! !m ) f ( ! !j1 )f (! + !r1 j4 m= 2a 2⇡ max( a, a+m)/ m j1 j2 j3 j1 +j2 +j3 +j4 =r1 +r2 ⇥f (! + !j1 )f (! + !r1 c(s1 ✓ (k1 + r1 k1 ,k2 = a j1 j2 j3 j1 ,j2 ,j3 ,j4 2Z a X /2, /2]4 [ j2 ) exp(is3 !k2 +j3 )ds1 ds2 ds3 ds4 R2 ✓ Z j2 )d! R2 j1 )⇡) sinc(v)sinc (v (j3 m r1 + j2 )⇡) dudvd! + O(` Z 2⇡ min(a,a r1 +j2 +j3 )/ g(!)g(! !j2 +j3 r2 ) j4 max( a,a r1 +j2 +j3 )/ + O(` ,a,n ). 44 ,a,n ) j2 ) ⇥ j2 ⇤ This gives the desired result. We now obtain and approximation of U1 and U2 . PROOF of Corollary 3.1 By using Lipschitz continuity of g(·) and f (·) and | j | C 0) we obtain the result. Qd i=1 |ji | (1+ ) I(j i ⇤ 6= PROOF of Theorem 3.6 We prove the result for the case d = 1 and using A1 (r1 , r2 ), . . . , A4 (r1 , r2 ) defined in proof of Theorem 3.3. The proof is identical to the proof of Theorem A.2. Following the same notation in proof of Theorem A.2 we have ✓ ◆ h i e e cov Qa, (g; r1 ), Qa, (g; r2 ) = A1 (r1 , r2 ) + A2 (r1 , r2 ) + B1 (r1 , r2 ) + B2 (r1 , r2 ) + O , n with |B2 (r1 , r2 )| = O((a )2 /n2 ) and the main term involving the trispectral density is B1 (r1 , r2 ) a X = c4 k1 ,k2 = a = c4 (2⇡)3 3 g(!k1 )g(!k2 )E 4 (s1 1 X j1 j2 j3 j4 j1 ,...,j4 = 1 j2 ) a X s 3 , s1 g(!k1 )g(!k2 ) k1 ,k2 = a e is3 (!2 +!k2 j3 ) s4 )eis1 !k1 e Z R3 is2 !k1 +r1 f4 (!1 , !2 , !3 ) Z e is3 !k2 is4 !k2 +r2 [ /2, /2]4 e eis1 (!1 +!2 +!3 +!k1 +j1 ) eis4 ( !3 +!k2 +r2 +j4 ) ds1 ds2 ds3 ds4 d!1 d!2 d!3 ✓ ◆ Z 1 a X X c4 (!1 + !2 + !3 ) = g(!k1 )g(!k2 ) f4 (!1 , !2 , !3 )sinc + (k1 + j1 )⇡ j1 j2 j3 j4 (2⇡)3 2 R3 j1 ,...,j4 = 1 k1 ,k2 = a ✓ ◆ ✓ ◆ ✓ ◆ !1 !2 !3 ⇥sinc + (k1 + r1 j2 )⇡ sinc + (k2 j3 )⇡ sinc (k2 + r2 + j4 )⇡ d!1 d!2 d!3 . 2 2 2 e is2 (!1 +!k1 +r1 s 2 , s1 Now we make a change of variables and let u1 = u3 = !2 3 (k2 + r2 + j4 )⇡, this gives = B1 (r1 , r2 ) 1 X c4 ⇡3 2 j1 j2 j3 j4 j1 ,...,j4 = 1 ⇥sinc (u1 + u2 + u3 + (r2 a X Z 3 k1 ,k2 = a R !1 2 + (k1 + r1 g(!k1 )g(!k2 )f4 ✓ 2u1 j2 ), u2 = !k1 +r1 !2 2 j2 , + (k2 2u2 j3 )⇡ and ! k2 j3 , 2u3 + !k2 +r2 +j4 r1 + j1 + j2 + j3 + j4 )⇡) sinc(u1 )sinc(u2 )sinc(u3 )du1 du2 du3 . Next we exchange the summand with a double integral and use Lemma E.1(iii) together with Lemma C.1, equation (44) to obtain B1 (r1 , r2 ) = Z 1 X c4 g(!1 )g(!2 ) ⇥ j1 j2 j3 j4 ⇡ 3 (2⇡)2 2 2⇡[ a/ ,a/ ] j1 ,...,j4 = 1 ✓ ◆ Z 2u1 2u2 2u3 f4 ! 1 ! r1 j 2 , !2 !j3 , + !2 + !r2 +j4 ⇥ R3 sinc (u1 + u2 + u3 + (r2 r1 + j1 + j2 + j3 + j4 )⇡) sinc(u1 )sinc(u2 )sinc(u3 )du1 du2 du3 d!1 d!2 + O 45 ✓ ◆ 1 . 
◆ By using Lemma C.3, we replace f4 2u1 !1 !r1 f4 ( !1 !r1 j2 , !2 !j3 , !2 + !r2 +j4 ), to give j2 , 2u2 !2 !j3 , 2u3 + !2 + !r2 +j4 with B1 (r1 , r2 ) Z 1 X c4 = g(!1 )g(!2 ) ⇥ j1 j2 j3 j4 ⇡ 3 (2⇡)2 2⇡[ a/ ,a/ ]2 j1 ,...,j4 = 1 Z f4 ( !1 !r1 j2 , !2 !j3 , !2 + !r2 +j4 ) sinc (u1 + u2 + u3 + (r2 R3 ✓ 3 ◆ log ( ) ⇥sinc(u1 )sinc(u2 )sinc(u3 )du1 du2 du3 d!1 d!2 + O Z X c4 = g(!1 )g(!2 ) ⇥ j1 j2 j3 j4 (2⇡)2 2 2⇡[ a/ ,a/ ] j1 +j2 +j3 +j4 =r2 r1 ✓ 3 ◆ log ( ) f4 ( !1 !r1 j2 , !2 !j3 , !2 + !r2 +j4 ) d!1 d!2 + O , r1 + j1 + j2 + j3 + j4 )⇡) where the last line follows from (39).h i e a, (g; r1 ), Q e a, (g; r2 ) we note that To obtain an expression for cov Q ✓ ◆ h i e a, (g; r1 ), Q e a, (g; r2 ) = A3 (r1 , r2 ) + A4 (r1 , r2 ) + B3 (r1 , r2 ) + B4 (r1 , r2 ) + O cov Q , n just as in the proof of Theorem A.2 we can show that |B4 (r1 , r2 )| = O(( a)d /n2 ) and the leading term involving the trispectral density is B3 (r1 , r2 ) a X = c4 g(!k1 )g(!k2 )E 4 (s1 k1 ,k2 = a = c4 (2⇡)3 3 1 X a X j1 j2 j3 j4 j1 ,...,j4 = 1 is2 (!1 +!k1 +r1 j2 ) s 2 , s1 s4 )eis1 !k1 e s 3 , s1 g(!k1 )g(!k2 ) k1 ,k2 = a is3 (!2 !k2 +j3 ) is4 ( !3 !k2 +r2 R3 e f4 (!1 , !2 , !3 ) Z is4 !k2 +r2 e eis1 (!1 +!2 +!3 +!k1 +j1 ) [ /2, /2]4 j4 ) ds1 ds2 ds3 ds4 d!1 d!2 d!3 ✓ ◆ c4 (!1 + !2 + !3 ) = g(!k1 )g(!k2 ) f4 (!1 , !2 , !3 )sinc + (k1 + j1 )⇡ j1 j2 j3 j4 (2⇡)3 2 3 R j1 ,...,j4 = 1 k1 ,k2 = a ✓ ◆ ✓ ◆ ✓ ◆ !1 !2 !3 ⇥sinc + (k1 + r1 j2 )⇡ sinc (k2 + j3 )⇡ sinc + (k2 + r2 j4 )⇡ d!1 d!2 d!3 . 2 2 2 e e e a X Z is2 !k1 +r1 is3 !k2 1 X We make a change of variables u1 = !3 j4 )⇡. This gives 2 + (k2 + r2 Z !1 2 + (k1 + r1 B3 (r1 , r2 ) = c4 2 (2⇡)3 1 X j1 j2 j3 j4 j1 ,...,j4 = 1 a X k1 ,k2 = a ⇥sinc (u1 + u2 + u3 + (j1 + j2 + j3 + j4 Z X 1 = j1 j2 j3 j4 (2⇡)2 2⇡[ j1 +j2 +j3 +j4 =r1 +r2 g(!k1 )g(!k2 ) r1 Z R3 f4 ( 2u1 !2 2 (k2 + j3 )⇡ and u3 = !k1 +r1 j2 , 2u2 + !k2 +j3 , 2u3 ! k2 r2 +j4 ) r2 )⇡) sinc(u1 )sinc(u2 )sinc(u3 )du1 du2 du3 a/ ,a/ ]2 46 j2 )⇡ u2 = g(!1 )g(!2 )f4 ( !1 ! r1 j 2 , !2 + !j3 , !2 ! r2 +j4 )d!1 d!2 Finally, by replacing f4 ( !1 !r1 j2 , !2 !j3 , !2 + !r2 +j4 ) with f4 ( !1 , !2 , !2 ) in B1 (r1 , r2 ) and f4 ( !1 !r1 j2 , !2 + !j3 , !2 ! r2 +j4 ) with f4 ( !1 , !2 , !2 ) and using the pointwise Lips⌘ ⇣ 3 log ( ) (1+ ) +1 chitz continuity of f4 and that | j | CI(j 6= 0)|j| we obtain B1 (r1 , r2 ) = D1 +O ⇣ 3 ⌘ and B3 (r1 , r2 ) = D3 + O log ( ) + 1 . Thus giving the required result. ⇤ C Technical Lemmas We first prove Lemma A.1, then state four lemmas, which form an important component in the proofs of this paper. Through out this section we use C to denote a finite generic constant. It is worth mentioning that many of these results build on the work of T. Kawata (see Kawata (1959)). PROOF of Lemma A.1 We first prove (30). By using partial fractions and the definition of the sinc function we have ✓ ◆ Z 1 Z 1 1 1 1 sinc(u)sinc(u + x)du = sin(u) sin(u + x) du x 1 u u+x 1 Z Z 1 1 1 sin(u) sin(u + x) sin(u) sin(u + x) = du du. x 1 u u+x 1 For the second integral we make a change of variables u0 = u + x, this gives Z 1 Z Z 1 1 sin(u) sin(u + x) 1 1 sin(u0 ) sin(u0 sinc(u)sinc(u + x)du = du x 1 u x 1 u0 1 ✓ ◆ Z 1 1 sin(u) = sin(u + x) sin(u x) du x 1 u Z 2 sin(x) 1 cos(u) sin(u) ⇡ sin(x) = du = . x u x 1 To prove (31), it is clear that for x = s⇡ (with s 2 Z{0}) ⇡ sin(s⇡) s⇡ x) du0 ⇤ = 0, which gives the result. The following result is used to obtain bounds for the variance and higher order cumulants. Lemma C.1 Define the function `p (x) = C/e for |x| e and `p (x) = C logp |x|/|x| for |x| e. 
(i) We have Z 1 1 | sin(x) sin(x + y)| dx |x(x + y)| Z 1 ( C log|y||y| C |y| e |y| < e = `1 (y), (42) |sinc(x)|`p (x + y)dx `p+1 (y) (43) 0 1 p p X Y @ A sinc xj sinc(xj ) dx1 . . . dxp C, (44) 1 and Z Rp j=1 j=1 47 (ii) Z a X m= a 1 1 sin2 (x) dx C log2 a x(x + m⇡) (iii) Z 1 1 Z 1 a X 1 m= sin2 (x) sin2 (y) dxdy C, |x(x + m⇡)| |y(y + m⇡)| a (45) (iv) a X m1 ,...,mq 1= Z q a R qY1 j=1 sinc(xq )sinc(xq + ⇡ sinc(xj )sinc(xj + mj ⇡) ⇥ q 1 X q Y mj ) j=1 j=1 dxj C log2(q 2) (a), where C is a finite generic constant which is independent of a. R1 sin(x+y)| PROOF. We first prove (i), equation (42). It is clear that for |y| e that 1 | sin(x) dx C. |x(x+y)| Therefore we now consider the case |y| > e, without loss of generality we prove the result for y > e. Partitioning the integral we have Z 1 | sin(x) sin(x + y)| dx = I1 + I2 + I3 + I4 + I5 , |x(x + y)| 1 where I1 = Z y 0 | sin(x) sin(x + y)| dx |x(x + y)| Z I2 = y | sin(x) sin(x + y)| dx |x(x + y)| 2y Z 1 | sin(x) sin(x + y)| = dx. |x(x + y)| y I3 = I5 Z I4 = 0 Z | sin(x) sin(x + y)| dx |x(x + y)| y 2y 1 | sin(x) sin(x + y)| dx |x(x + y)| To bound I1 we note that for y > 1 and x > 0 that | sin(x + y)/(x + y)| 1/y, thus Z 1 y | sin(x)| log y I1 = dx C . y 0 |x| y To bound I2 , we further partition the integral I2 = Z y/2 | sin(x) sin(x + y)| dx + |x(x + y)| y Z y/2 Z 2 | sin(x + y)| 2 dx + y y |(x + y)| y 48 Z 0 y/2 0 y/2 | sin(x) sin(x + y)| dx |x(x + y)| | sin(x)| log y dx C . |x| y To bound I3 , we use the bound 1 I3 y To bound I4 we use that for y > 0, Z y 2y R1 y I4 | sin(x + y)| log y dx C . |(x + y)| y x Z 2 dx y 1 C|y| 1, thus 1 dx C|y| x2 1 and using a similar argument we have I5 C|y| 1 . Altogether, this gives (42). R1 We now prove (43). It is clear that for |y| e that 1 |sinc(x)|`p (x + y)|dx C. Therefore we now consider the case |y| > e, without loss of generality we prove the result for y > e. As in (42) we partition the integral Z 1 |sinc(x)`p (x + y)|dx = II1 + . . . + II5 , 1 where II1 , . . . , II5 are defined in the same way as I1 , . . . , I5 just with |sinc(x)`p (x + y)| replacing | sin(x) sin(x+y)| . To bound II1 we note that |x(x+y)| II1 = Z y 0 logp (y) |sinc(x)`p (x + y)|dx y Z y 0 p+1 |sinc(x)|dx logp+1 (y) , y p+1 we use similar method to show II2 C log y (y) and II3 C log y (y) . Finally to bound II4 and II5 we note that by using a change of variables x = yz, we have Z 1 Z 1 Z | sin(x)| logp (x + y) logp (x + y) 1 1 [log(y) + log(z + 1)]p logp (y) II5 = dx dx = dz C . x(x + y) x(x + y) y 1 z(z + 1) y y y p Similarly we can show that II4 C logy(y) . Altogether, this gives the result. To prove (44) we recursively apply (43) to give Z Rp |sinc(x1 + . . . + xp )| p Y j=1 |sinc(xj )|dx1 . . . dxp Z Z Rp R 1 |`p |`1 (x1 + . . . + xp 1) 1 (x1 )sinc(x1 )|dx1 pY1 sinc(xj )|dx1 . . . dxp j=1 = O(1), thus we have the required the result. P P To bound (ii), without loss of generality we derive a bound over am=1 , the bounds for ma is identical. Using (42) we have a Z 1 a Z 1 X X sin2 (x) | sin(x) sin(x + m⇡)| dx = dx |x(x + m⇡)| 1 |x(x + m⇡)| 1 m=1 a X m=1 m=1 a X log(m⇡) `1 (m⇡) = C m⇡ m=1 a X 1 C log(a⇡) = C log(a⇡) log(a) C log2 a. m⇡ m=1 49 1 Thus we have shown (ii). To prove (iii) we use (42) to give ✓Z 1 a X m= a a X C m= a 1 sin2 (x) dx |x(x + m⇡)| log m m 2 C 1 X ◆ ✓Z m= 1 1 1 sin2 (y) dy |y(y + m⇡)| log m m 2 ◆ C. To prove (iv) we apply (42) to each of the integrals this gives a X m1 ,...,mq a X m1 ,...,mq 1= Z q a R qY1 sinc(xj )sinc(xj + mj ⇡)sinc(xq )sinc(xq + ⇡ j=1 `1 (m1 ⇡) . . . 
`1 (mq 1= q 1 X mj ) j=1 1 ⇡)`1 (mq 1 ⇡)`1 (⇡ a q 1 X j=1 mj ) C log2(q q Y dxj j=1 2) a, ⇤. thus we obtain the desired result. The proofs of Theorem 3.1, Theorems A.1 (ii), Theorem 2.2 involve integrals of sinc(u)sinc(u + m⇡), where |m| a + |r1 | + |r2 | and that a ! 1 as ! 1. In the following lemma we obtain bounds for these integrals. We note that all these results involve an inner integral di↵erence of the form ✓ ◆ Z b 2u g(!) h ! + h(!) d!du . b By using the mean value theorem heuristically it is clear that O( 1 ) should come out of the integral however the 2u in the numerator of h ! + 2u makes the analysis quite involved. Lemma C.2 Suppose h is a function which is absolutely integrable and |h0 (!)| (!) (where is a montonically decreasing function that is absolutely integrable) and m 2 Z and g(!) is a bounded function. b can take any value, and the bounds given below are independent of b. Then we have ✓ ◆ Z 1 Z b 2u sinc(u)sinc(u + m⇡) g(!) h ! + h(!) d!du 1 C b log( ) + I(m 6= 0) log(|m|) , (46) where C is a finite constant independent of m and b. If g(!) is a bounded function with a bounded first derivative, then we have ◆ ✓ Z 1 Z b 2u sinc(u)sinc(u + m⇡) h(!) g(! + ) g(!) d!du 1 C b log( ) + I(m 6= 0) log(|m|) . (47) R In the case of double integrals, we assume h(·, ·) is such that R2 |h(!1 , !2 )|d!1 d!2 < 1, and 2 h(! ,! ) 1 ,!2 ) 1 ,!2 ) 1 2 | (!1 ) (!2 ) (where is a | @h(! | (!1 ) (!2 ), | @h(! | (!1 ) (!2 ) and | @ @! @!1 @!2 1 @!2 50 montonically decreasing function that is absolutely integrable) and g(·, ·) is a bounded function. Then if m1 6= 0 and m2 6= 0 we have Z 1Z 1 Z b Z b sinc(u1 )sinc(u1 + m1 ⇡)sinc(u2 )sinc(u2 + m2 ⇡) g(!1 , !2 ) ⇥ 1 and 1 ✓ ◆ 2u1 2u2 h !1 + , !2 + Z 1 1 Z 1 b h(!1 , !2 ) d!1 d!2 du1 du2 C ✓ ◆ 2u1 2u2 h ! k1 + , !k 2 + where a ! 1 as h(!k1 , !k2 ) du1 du2 C ! 1. ) + log(|mi |)] i=1 [log( sinc(u1 )sinc(u1 + m1 ⇡)sinc(u2 )sinc(u2 + m2 ⇡) 1 b Q2 2 b b 1 XX 2 b Q2 b (48) g(!k1 , !k2 ) ⇥ ) + log(|mi |)] i=1 [log( . 2 . (49) PROOF. To simplify the notation in the proof, we’ll prove (46) for m > 0 (the proof for m 0 is identical). The proof is based on considering the cases that |u| and |u| > separately. For |u| we apply the mean value theorem to the di↵erence h(! + 2u ) h(!) and for |u| > we exploit R that the integral u>| | |sinc(u)sinc(u + m⇡)|du decays as ! 1. We now make these argument precise. We start by partitioning the integral ✓ ◆ Z 1 Z b 2u sinc(u)sinc(u + m⇡) g(!) h(! + ) h(!) d!du = I1 + I2 , (50) where 1 b I1 = I2 = Z Z sinc(u)sinc(u + m⇡) |u|> sinc(u)sinc(u + m⇡) Z Z b ✓ g(!) h(! + b ✓ b g(!) h(! + b 2u 2u I12 = I13 = Z Z sinc(u)sinc(u + m⇡) m⇡ m⇡ sinc(u)sinc(u + m⇡) 1 I23 = Z Z sinc(u)sinc(u + m⇡) sinc(u)sinc(u + m⇡) ✓ b g(!) h(! + b Z b ✓ Z Z min(4|u|/ ,b) ✓ h(!) d!du. 2u b b min(4|u|/ ,b) 51 ) 2u g(!) h(! + min(4|u|/ ,b) min(4|u|/ ,b) ◆ h(!) d!du g(!) h(! + b and partition I2 = I21 + I22 + I23 , where Z Z I21 = sinc(u)sinc(u + m⇡) I22 = Z h(!) d!du ◆ ) We further partition the integral I1 = I11 + I12 + I13 , where ✓ Z 1 Z b 2u I11 = sinc(u)sinc(u + m⇡) g(!) h(! + ) b ) ◆ ◆ h(!) d!du ◆ ) 2u h(!) d!du ) ✓ 2u g(!) h(! + ) ✓ 2u g(!) h(! + ) ◆ h(!) d!du ◆ h(!) d!du ◆ h(!) d!du. We start by bounding I1 . Taking absolutes of I11 , and using that h(!) is absolutely integrable we have Z 1 sin2 (u) |I11 | 2 du, u(u + m⇡) R1 R 1 sin2 (u) where = supu |g(u)| 0 |h(u)|du. Since m > 0, it is straightforward to show that u(u+m⇡) du 1 1 C , where C is some finite constant. This implies |I11 | 2C . Similarly it can be shown 1 that |I13 | 2C . 
To bound I12 we note that ⇣ ⌘ ( Z 2 log m⇡ < 2 sin (u) 2 m⇡ |I12 | du ⇥ . m⇡ |u + m⇡| log + log(m⇡ ) m⇡ > Thus, we have |I12 | 2 1 ⇥ ⇤ log + log m . Altogether, the bounds for I11 , I12 , I13 give |I1 | C(log + log m) To bound I2 we apply the mean value theorem to h(! + 0 ⇣(!, u)| 1. Substituting this into I23 gives |I23 | 2 Z sin2 (u) |u + m⇡| Z b . 2u ) h(!) = h0 (! + ⇣(!, u) 2u 0 h (! 2u + ⇣(!, u) 2u ), where ) d!du. min(4|u|/ ,b) Since the limits of the inner integral are greater than 4u/ , and the derivative is bounded by (!), this means |h0 (! + ⇣(!, u) 2u )| max[ (!), (! + 2u )] = (!). Altogether, this gives !Z Z b 2 sin2 (u) 2 log( + m⇡) |I23 | (!)d! du . |u + m⇡| min(4|u|/ ,b) Using the same method we obtain I22 2 log( +m⇡) . Finally, to bound I21 , we cannot bound h0 (! + ⇣(!, u) 2u ) by a monotonic function since ! and ! + 2u can have di↵erent signs. Therefore we simply bound h0 (! + ⇣(!, u) 2u ) with a constant, this gives |I21 | 8C 2 Z |u| sin2 (u) 16C du . |u + m⇡| Altogether, the bounds for I21 , I22 , I23 give |I2 | C log + log(m) + log( + m⇡) . Finally, we recall that if > 2 and m⇡ > 2, then ( + m⇡) < m⇡, thus log( + m⇡) log + log(m⇡), Therefore, we obtain (46). The proof of (47) is similar, but avoids some of the awkward details that are required to prove (46). To prove (48) we note that both m1 6= 0 and m2 6= 0 (if one of these values were zero the slower bound given in (46) only holds). Without loss of generality we will prove the result for m1 > 0 and 52 m2 > 0. We first note since m1 6= 0 and m2 6= 0, by orthogonality of the sinc function at integer shifts (see Lemma A.1) we have Z b Z b Z 1Z 1 sinc(u1 )sinc(u1 + m1 ⇡)sinc(u2 )sinc(u2 + m2 ⇡) g(!1 , !2 ) ⇥ b b 1 ✓ 1 ◆ 2u1 2u2 h !1 + , !2 + h(!1 , !2 ) d!1 d!2 du1 du2 Z = Z 1 1 sinc(u2 )sinc(u2 + m2 ⇡) sinc(u1 )sinc(u1 + m1 ⇡) 1 1 ✓ ◆ 2u1 2u2 ⇥h !1 + , !2 + d!1 du1 d!2 du2 , Z b b Z b g(!1 , !2 ) b (51) since in the last line h(!1 , !2 ) comes outside the integral over u1 and u2 . To further the proof, we note that for m2 6= 0 we have Z 1 Z Z b Z b sinc(u2 )sinc(u2 + m2 ⇡) sinc(u1 )sinc(u1 + m1 ⇡) g(!1 , !2 ) 1 |u1 |> b b ✓ ◆ 2u1 ⇥h !1 + , !2 d!1 du1 d!2 du2 = 0. (52) We use this zero-equality at relevant parts in the proof. We subtract (52) from (51), and use the same decomposition and notation used in (50) we R R decompose the integral in (51) over u1 into |u1 |> + |u1 | , to give = Z = Z ✓Z 1 Z 1 b Z b sinc(u2 )sinc(u2 + m2 ⇡) sinc(u1 )sinc(u1 + m1 ⇡) g(!1 , !2 ) ⇥ 1 1 b b ✓ ◆ ✓ ◆ ◆ 2u1 2u2 2u1 h !1 + , !2 + h !1 + , !2 d!1 du1 d!2 du2 Z 1 Z b ✓Z 1 Z b = sinc(u2 )sinc(u2 + m2 ⇡) sinc(u1 )sinc(u1 + m1 ⇡) g(!1 , !2 ) ⇥ 1 b 1 b ✓ ◆ ✓ ◆ ◆ 2u1 2u2 2u1 , !2 + h !1 + , !2 d!1 du1 d!2 du2 h !1 + 1 sinc(u2 )sinc(u2 + m2 ⇡) 1 Z b [I1 (u2 , !2 ) + I2 (u2 , !2 )] du2 d!2 = J1 + J2 , b where I1 (u2 , !2 ) = Z |u1 |> Z b I2 (u2 , !2 ) = Z b sinc(u1 )sinc(u1 + m1 ⇡) ⇥ ✓ ◆ 2u1 2u2 g(!1 , !2 ) h !1 + , !2 + sinc(u1 )sinc(u1 + m1 ⇡) |u1 | Z b ✓ h !1 + @h g(!1 , !2 ) @!1 b ✓ 2u1 , !2 ◆ d!1 du1 !1 + ⇣1 (!1 , u1 ) 2u1 , !2 + 2u2 ◆ d!1 du1 and |⇣1 (!1 , u1 )| 1. Note that the expression for I2 (u2 , !2 ) applies the mean value theorem to the di↵erence h !1 + 2u1 , !2 + 2u2 h !1 + 2u1 , !2 . 
Next to bound J1 we decompose the outer 53 integral over u2 into J11 = Z Z R J12 = Z Z b |u2 | to give J1 = J11 + J12 , where ✓ ◆ 2u1 2u2 g(!1 , !2 ) h !1 + , !2 + Z |u1 |> R sinc(u1 )sinc(u1 + m1 ⇡)sinc(u2 )sinc(u2 + m2 ⇡) ⇥ |u1 |> |u2 |> Z b Z b b and |u2 |> ✓ h !1 + 2u1 , !2 ◆ d!1 du1 du2 d!2 b b sinc(u1 )sinc(u1 + m1 ⇡) ⇥ sinc(u2 )sinc(u2 + m2 ⇡) |u2 | Z b @h g(!1 , !2 ) @! 2 b ✓ !1 + 2u1 , !2 + ⇣2 (!2 , u2 ) 2u2 ◆ d!1 du1 du2 d!2 , and |⇣2 (!2 , u2 )| < 1. Note that the expression for J12 is obtained by applying the mean value theorem to h !1 + 2u1 , !2 + 2u2 h !1 + 2u1 , !2 . Now by using the same methods used to bound I1 and I2 in (50) we can show that |J11 | = O( 2 ), |J12 | = O([log( ) + log |m2 |]/ 2 ) and |J21 | = O( 2 ). To bound J2 we again use that m2 6= 0 and orthogonality of the sinc function to @h add the additional term @! !1 + ⇣1 (!1 , u1 ) 2u1 , !2 (whose total contribution is zero) to J2 to give 1 J2 = Z Z 1 1 b Z b sinc(u2 )sinc(u2 + m2 ⇡) b @h g(!1 , !2 ) @! 1 b ✓ Z |u1 | !1 + ⇣1 (!1 , u1 ) 2u1 sinc(u1 )sinc(u1 + m1 ⇡) ⇥ , !2 + 2u2 ◆ @h @!1 ✓ !1 + ⇣1 (!1 , u1 ) 2u1 , !2 ◆ d!1 d!2 du1 du2 . R R By decomposing the outer integral of J2 over u2 into |u2 |> and |u2 | we have J2 = J21 + J22 , where Z Z b Z J21 = sinc(u2 )sinc(u2 + m2 ⇡) sinc(u1 )sinc(u1 + m1 ⇡) ⇥ Z |u2 |> b b @h g(!1 , !2 ) @!1 b ✓ |u1 | !1 + ⇣1 (!1 , u1 ) 2u1 , !2 + 2u2 ◆ @h @!1 ✓ !1 + ⇣1 (!1 , u1 ) 2u1 , !2 ◆ d!1 d!2 du1 du2 and J22 = Z |u2 | Z b Z b sinc(u2 )sinc(u2 + m2 ⇡) b @2h g(!1 , !2 ) @!1 @!2 b ✓ Z |u1 | !1 + ⇣1 (!1 , u1 ) 2u1 sinc(u1 )sinc(u1 + m1 ⇡) ⇥ , !2 + ⇣3 (!2 , u2 ) 2u2 ◆ d!1 d!2 du1 du2 with |⇣3 (!2 , u2 )| < 1. Note that the expression for J22 is obtained by applying the mean value 2u1 @h @h theorem to @! !1 + ⇣1 (!1 , u1 ) 2u1 , !2 + 2u2 , !2 . @!1 !1 + ⇣1 (!1 , u1 ) 1 Again by using the methods used to bound I1 and I2 in (50) we can show that |J21 | = O([log( )+ Q log |m1 |]/ 2 ) and |J22 | = O( 2i=1 [log( ) + log |mi |]/ 2 ). Altogether this proves (48). The proof of (49) follows exactly the same method used to prove (48) the only di↵erence is that the summand rather than the integral makes the notation more cumbersome. For this reason we omit the details. ⇤ 54 The following result is used in to obtain expression for the fourth order cumulant term (in the case that the spatial random field is not Gaussian). It is used in the proofs of Theorems 3.6 and A.2. Lemma C.3 Suppose h is a function which is absolutely integrable and |h0 (!)| (!) (where is a monotonically decreasing function that is absolutely integrable), m 2 Z and g(!) is a bounded function. Then we have Z R3 = O sinc(u1 + u2 + u3 + m⇡)sinc(u1 )sinc(u2 )sinc(u3 ) ✓ [log( )| + log |m|]3 Z R2 = O ◆ a/ a/ ✓ 2u1 g(!) h ! ◆ h( !) d!du1 du2 du3 . (53) sinc(u1 + u2 + m⇡)sinc(u1 )sinc(u2 ) ✓ Z [log( ) + log |m|]3 ◆ Z a/ a/ ✓ 2u1 g(!) h ! ◆ h( !) d!du1 du2 . (54) PROOF. The proof of (53) is very similar to the proof of Lemma C.2. We start by partitioning the R R integral over u1 into |u1 | and |u1 |> Z sinc(u1 + u2 + u3 + m⇡)sinc(u1 )sinc(u2 )sinc(u3 ) ⇥ ✓ ◆ 2u1 g(!) h( !) h( !) d!du1 du2 du3 = I1 + I2 , R3 Z a/ a/ where I1 = Z |u1 |> ⇥ Z a/ R3 sinc(u1 + u2 + u3 + m⇡)sinc(u1 )sinc(u2 )sinc(u3 ) ✓ g(!) h( a/ I2 = Z |u1 | Z a/ ⇥ Z a/ Z R3 2u1 ◆ !) h( !) d!du1 du2 du3 sinc(u1 + u2 + u3 + m⇡)sinc(u1 )sinc(u2 )sinc(u3 ) ✓ g(!) h( 2u1 ◆ !) h( !) d!du1 du2 du3 . 
Taking absolutes of I1 and using Lemma C.1, equations (42) and (43) we have |I1 | 4 Z |u1 |> |sinc(u1 + m⇡)|`2 (u1 )du1 C 55 Z |u|> log2 (u) |sinc(u + m⇡)| ⇥ du, |u| |u + m⇡| R1 where = sup! |g(!)| 0 |h(!)|d! and C is a finite constant (which has absorbed ). Decomposing the above integral we have Z log2 (u) |sinc(u + m⇡)| |I1 | C ⇥ du = I11 + I12 , |u| |u + m⇡| |u|> where I11 = I12 = Z Z <|u| (1+|m|) |u|> (1+|m|) log2 (u) |sinc(u1 + m⇡)| ⇥ du |u| |u1 + m⇡| log2 (u) |sinc(u1 + m⇡)| ⇥ du. |u| |u1 + m⇡| We first bound I11 I11 log2 [ (1 + |m|⇡)] 2 Z <|u| (1+|m|) 2C log [ (1 + |m|⇡)] |sinc(u1 + m⇡)| du |u1 + m⇡| ⇥ log( + m⇡) = C (log |m| + log )3 To bound I12 we make a change of variables u = z, the above becomes ✓ 2 ◆ Z C [log + log z]2 log ( ) I12 dz = O . z(z + m ) |z|>1+|m| Altogether the bounds for I11 and I12 give |I1 | C(log |m| + log )3 / . To bound I2 , just as in Lemma C.2, equation (46), we decompose it into three parts I2 = I21 + I22 + I23 , where using Lemma C.1, equations (42) and (43) we have the bounds ✓ ◆ Z Z min(a,4u)/ 2u |I21 | |sinc(u + m⇡)|`2 (u) |g(!)| h ! h( !) d!du |I22 | |I23 | Z Z |u| |u| |u| |sinc(u + m⇡)|`2 (u) |sinc(u + m⇡)|`2 (u) Z Z min(a,4|u|)/ min(a,4|u|)/ |g(!)| h a/ a/ min(a,4|u|)/ |g(!)| h ✓ ✓ 2u 2u ! ! ◆ ◆ h( !) d!du h( !) d!du. Using the same method used to bound |I21 |, |I22 |, |I23 | in Lemma C.2, we have |I21 |, |I22 |, |I23 | C[log( ) + log(|m|)]3 / . Having bounded all partitions of the integral, we have the result. The proof of (54) is identical and we omit the details. ⇤ C.1 Lemmas required to prove Lemma A.2 and Theorem A.1 In this section we give the proofs of the three results used in Lemma A.2 (which in turn proves Theorem A.1). Lemma C.4 Suppose Assumptions 2.4(ii) and 2.5(b,c) holds. Then for r1 , r2 2 Z we have ✓ 2 ◆ log (a) |A1 (r1 , r2 ) B1 (r1 r2 ; r1 )| = O 56 PROOF. To obtain a bound for the di↵erence we use Lemma E.1(ii) to give |A1 (r1 , r2 ) B1 (r1 r2 ; r1 )| Z 1Z 1 X 2a = |sinc(u)sinc(u 1 Hm, ( | C 2u 2v 2u 2v , ; r1 ) Hm ( , ; r1 ) dudv {z } 2a X Z 2a X Z m= 2a C m= 2a 2 = O( m⇡)sinc(v)sinc(v + (m + r1 r2 )⇡)| 1 m= 2a log a C/ 1 1 1 1 |sinc(u)sinc(u |sinc(u)sinc(u m⇡)|du Z | 1 1 |sinc(v)sinc(v + (m + r1 {z <1 m⇡)|du (from Lemma C.1(ii)) r2 )⇡)|dv } ), ⇤ thus giving the desired result. Lemma C.5 Suppose Assumptions 2.4(ii) and 2.5(b,c) holds (with 0 |r1 |, |r2 | < C|a|). Then we have ✓ ◆ log a + log 2 |B1 (s; r) C1 (s; r)| = O log (a) . Note, if we relax the assumption on r1 , r2 to r1 , r2 2 Z then the above bound requires the additional term log2 (a)[I(r1 6= 0) log |r1 | + I(r2 0) log |r2 |]/ . PROOF. Taking di↵erences, it is easily seen that B1 (s, r) C1 (s, r) Z 2a X = sinc(u)sinc(u R2 m= 2a Z 2⇡ min(a,a+m)/ 1 (2⇡) = I1 + I2 ⇥ 2⇡ max( a, a+m)/ m⇡)sinc(v)sinc(v + (m + s)⇡) g(!)g(! + !m ) f (! 2u )f (! + 2v + !r ) f (!)f (! + !r ) d!dvdu where I1 = Z 2a X sinc(u)sinc(u R2 m= 2a Z 2⇡ min(a,a+m)/ m⇡)sinc(v)sinc(v + (m + s)⇡) 1 2v ⇥ g(!)g(! + !m )f (! + + !r ) f (! (2⇡) 2⇡ max( a, a+m)/ Z X 2a = sinc(v)sinc(v + (m + s)⇡)Dm (v)dv R m= 2a 57 2u ) f (!) d!dvdu and I2 = Z 2a X 1 ⇥ (2⇡) = sinc(u)sinc(u R2 m= 2a Z 2⇡ min(a,a+m)/ 2a X g(!)g(! + !m )f (!) f (! + 2⇡ max( a, a+m)/ dm m= 2a m⇡)sinc(v)sinc(v + (m + s)⇡) Z R sinc(u)sinc(u 2v + !r ) f (! + !r ) d!dvdu m⇡)du with Dm (v) = Z Z 2⇡ min(a,a+m)/ 1 2v sinc(u)sinc(u m⇡) g(!)g(! + !m )f (! + + !r ) (2⇡) 2⇡ max( a, a+m)/ R 2u ⇥ f (! ) f (!) d!du and dm = Z Z 2⇡ min(a,a+m)/ 1 sinc(v)sinc(v + (m + s)⇡) g(!)g(! + !m )f (!) (2⇡) 2⇡ max( a, a+m)/ R 2v ⇥ f (! + + !r ) f (! + !r ) d!dv. 
Since the functions f (·) and g(·) satisfy the conditions stated in Lemma C.2, the lemma can be used to show that ✓ ◆ log + log a max sup |Dm (v)| C |m|a v and max |dm | C |m|a ✓ log + log a ◆ . Substituting these bounds into I1 and I2 give |I1 | C |I2 | C ✓ ✓ log + log a ◆ X Z 2a m= 2a log + log a ◆ X Z 2a m= 2a 1 1 1 1 |sinc(v)sinc(v + (m + s)⇡)|dv |sinc(u)sinc(u m⇡)|du. Therefore, by using Lemma C.1(ii) we have ✓ 2 |I1 | and |I2 | = O log (a) Since |B1 (s; r) log a + log C1 (s; r)| |I1 | + |I2 | this gives the desired result. 58 ◆ . ⇤ D ea, (g; r) Approximations to the covariance and cumulants of Q e a, (g; r q ) , these results e a, (g; r 1 ), . . . , Q In this section, our objective is to obtain bounds for cumq Q e a, (g; r) (given in Section A.2) will be used to prove the asymptotic expression for the variance of Q e a, (g; r). Fox and Taqqu (1987), Dahlhaus (1989), Giraitis and and asymptotic normality of Q Surgailis (1990) (see also Taqqu and Peccati (2011)) have developed techniques for dealing with the cumulants of sums of periodograms of Gaussian (discrete time) time series, and one would have expected that these results could be used here. However, in our setting there are a few di↵erences that we now describe (i) despite the spatial random being Gaussian the locations are randomly sampled, thus the composite process Z(s) is not Gaussian (we can only exploit the Gaussianity when we condition on the location) (ii) the random field is defined over Rd (not Zd ) (iii) the number e a, (·) is not necessarily the sample size. Unfortunately, these di↵erences of terms in the sums Q make it difficult to apply the above mentioned results to our setting. Therefore, in this section we consider cumulant based results for spatial data observed at irregular locations. In order to reduce cumbersome notation we focus on the case that the locations are from a uniform distribution. e a, (1, 0)]. By using indecomposable partitions we As a simple motivation we first consider var[Q have = = e a, (1, 0)] var[Q n a X X 1 n4 1 n4 j1 ,j2 ,j3 ,j4 =1 j1 6=j2 ,j3 6=j4 n X j1 ,j2 ,j3 ,j4 =1 r1 6=r2 ,r3 6=r4 cov [Z(sj1 )Z(sj2 ) exp(i!k1 (sj1 sj2 )), Z(sj3 )Z(sj4 ) exp(i!k2 (sj3 sj4 ))] k1 ,k2 = a a X k1 ,k2 = a ✓ ⇥ cum Z(sj1 )eisj1 !k1 , Z(sj3 )e isj3 !k2 ⇤ ⇥ cum Z(sj2 )e isj2 !k1 , Z(sj4 )eisj4 !k2 ⇥ ⇤ ⇥ ⇤ +cum Z(sj1 )eisj1 !k1 , Z(sj4 )eisj4 !k2 cum Z(sj2 )e isj2 !k1 , Z(sj3 )e isj3 !k2 ◆ ⇥ ⇤ isj1 !k1 isj2 !k1 isj3 !k2 isj4 !k2 +cum Z(sj1 )e , Z(sj2 )e , Z(sj3 )e , Z(sj4 )e . (55) In order to evaluate the covariances in the above we condition on the locations {sj }. To evaluate the fourth order cumulant of the above we appeal to a generalisation of the conditional variance method. This expansion was first derived in Brillinger (1969), and in the general setting it is stated as cum(Y1 , Y2 , . . . , Yq ) = X ⇡ cum [cum(Y⇡1 |s1 , . . . , sq ), . . . , cum(Y⇡b |s1 , . . . , sq )] , (56) where the sum is over all partitions ⇡ of {1, . . . , q} and {⇡1 , . . . , ⇡b } are all the blocks in the partition ⇡. We use (56) to evaluate cum[Z(sj1 )eisj1 !k1 , . . . , Z(sjq )eisjq !kq ], where Yi = Z(sji )eisji !ki and we condition on the locations {sj }. Using this decomposition we observe that because the spatial process is Gaussian, cum[Z(sj1 )eisj1 !k1 , . . . , Z(sjq )eisjq !kq ] can only be composed of cumulants of covariances conditioned on the locations. Moreover, if s1 , . . . , sq are independent then by using the same reasoning we see that cum[Z(s1 )eis1 !k1 , . . . , Z(sq )eisq !kq ] = 0. 
Therefore, 59 ⇤ ⇥ ⇤ cum Z(sj1 )eisj1 !k1 , Z(sj2 )e isj2 !k1 , Z(sj3 )eisj3 !k2 , Z(sj4 )e isj4 (!k2 will only be non-zero if some elements of sj1 , sj2 , sj3 , sj4 are dependent. Using these rules we have = e a, (1, 0)] var[Q X 1 n4 a X j1 ,j2 ,j3 ,j4 2D4 k1 ,k2 = a ✓ ⇥ cum Z(sj1 )eisj1 !k1 , Z(sj3 )e ⇥ ⇤ ⇥ +cum Z(sj1 )eisj1 !k1 , Z(sj4 )eisj4 !k2 cum Z(sj2 )e 1 + 4 n X a X j1 ,j2 ,j3 ,j4 2D3 k1 ,k2 = a ✓ 1 n4 X a X j1 ,j2 ,j3 ,j4 2D3 k1 ,k2 = a isj2 !k1 ⇥ cum Z(sj1 )eisj1 !k1 , Z(sj3 )e ⇥ ⇤ ⇥ +cum Z(sj1 )eisj1 !k1 , Z(sj4 )eisj4 !k2 cum Z(sj2 )e + isj3 !k2 ⇥ cum Z(sj2 )e , Z(sj3 )e isj3 !k2 isj2 !k1 ⇥ cum Z(sj1 )eisj1 !k1 , Z(sj2 )e ⇤ ⇤ ⇤ isj3 !k2 ⇤ , Z(sj4 )eisj4 !k2 ◆ ⇥ cum Z(sj2 )e , Z(sj3 )e isj2 !k1 isj3 !k2 isj2 !k1 isj2 !k1 , Z(sj4 )eisj4 !k2 ◆ , Z(sj3 )eisj3 !k2 , Z(sj4 )e isj4 !k2 ⇤ Lemma D.1 Suppose Assumptions 2.1, 2.2, 2.4 and 2.5(b,c) hold (note we only use Assumption 2.5(c) to get ‘neater expressions in the proofs’ it is not needed to obtain the same order). Then we have a sup a a X cum[Z(s1 )eis1 !k1 , Z(s2 )e is2 !k1 , Z(s3 )e is3 !k2 , Z(s1 )eis1 !k2 ] = O(1) (58) k1 ,k2 = a a X k1 ,k2 = a ⇥ ⇤ ⇥ cum Z(s1 )eis1 !k1 , Z(s1 )eis1 !k2 cum Z(s2 )e is2 !k1 , Z(s3 )e is3 !k2 ⇤ = O(1). (59) PROOF. To show (58) we use conditional cumulants (see (56)). By using the conditional cumulant 60 ⇤ , (57) where D4 = {j1 , . . . , j4 = all js are di↵erent}, D3 = {j1 , . . . , j4 ; two js are the same but j1 6= e a, (1, 0) more than two elements in {j1 , . . . , j4 } j2 and j3 6= j4 } (noting that by definition of Q cannot be the same). We observe that |D4 | = O(n4 ) and |D3 | = O(n3 ), where | · | denotes the cardinality of a set. We will show that the second and third terms are asymptotically negligible with respect to the first term. To show this we require the following lemma. sup ⇤ expansion and Gaussianity of Z(s) conditioned on the location we have a X cum[Z(s1 )eis1 !k1 , Z(s2 )e k1 ,k2 = a a X = k1 ,k2 = a a X k1 ,k2 = a a X is2 !k1 , Z(s3 )e is3 !k2 , Z(s1 )eis1 !k2 ] cum[c(s1 s2 )eis1 !k1 is2 !k1 , c(s1 s3 )e is3 !k2 +is1 !k2 ]+ cum[c(s1 s3 )eis1 !k1 is3 !k2 , c(s1 s2 )e is2 !k1 +is1 !k2 ]+ cum[c(0)eis1 (!k1 +!k2 ) , c(s2 k1 ,k2 = a | is2 !k1 is3 !k2 s3 )e {z ] = I1 + I2 . } =0 (since s1 is independent of s2 and s3 ) Writing I1 in terms of expectations and using the spectral representation of the covariance we have I1 = a X k1 ,k2 = a ⇥ ✓ ⇥ E c(s1 ⇥E c(s1 = 1 (2⇡)2 is1 !k2 is3 !k2 Z Z a X 1 (2⇡)2 = E1 s3 )e k1 ,k2 = a Z a X k1 ,k2 = a s3 )eis1 (!k1 +!k2 ) e s2 )c(s1 Z ⇤ ◆ f (x)f (y)sinc f (x)f (y)sinc ✓ ✓ 2 2 = e is2 !k1 ⇤ ◆ ⇥ E c(s1 (x + y) + (k1 + k2 )⇡ sinc ◆ x + k2 ⇡ sinc ✓ 2 ✓ ◆ 2 s2 )eis1 !k1 ◆ x + k1 ⇡ sinc x + k2 ⇡ sinc ✓ 2 ✓ ◆ 2 is2 !k1 ⇤ ◆ y + k2 ⇡ dxdy y + k1 ⇡ sinc ✓ 2 ◆ y + k1 ⇡ dxdy E2 . To bound E1 we make a change of variables u = integral (and use Lemma E.1) to give E1 = is3 !k2 1 ⇡2 2 Z Z 1 (2⇡)2 ⇡ 2 a X f( 2u !k1 )f ( 2v k1 ,k2 = a Z Z sinc(u + v)sinc(u)sinc(v) x 2 + k1 ⇡, v = y 2 + k2 ⇡, and replace sum with !k2 )sinc(u + v)sinc(u)sinc(v)dudv ✓Z 2⇡a/ 2⇡a/ Z 2⇡a/ f( 2⇡a/ 2u !1 )f ( 2v ◆ 1 !2 )d!1 d!2 dudv + O( ). R 2⇡a/ 1 2u Let G( 2u ) = 2⇡ !)d!, then substituting this into the above and using equation (44) 2⇡a/ f ( in Lemma C.1 we have ✓ ◆ ✓ ◆ Z Z 2u 2v 1 E1 = sinc(u + v)sinc(u)sinc(v)G G dudv + O( ) = O(1). To bound E2 we use a similar technique and Lemma C.1(iii) to give E2 = O( 1 ). Altogether, this gives I1 = O(1). The same proof can be used to show that I2 = O(1). Altogether this gives (58). 
61 ⇥ ⇤ To bound (59), we observe that if k1 6= k2 , then cum Z(s1 )eis1 !k1 , Z(s1 )eis1 !k2 = E[c(0)eis(!k1 +!k2 ) ] = ⇥ ⇤ 0 otherwise cum Z(s1 )eis1 !k , Z(s1 )e is1 !k = c(0). Using this, (59) can be reduced to a X ⇥ ⇤ ⇥ cum Z(s1 )eis1 !k1 , Z(s1 )eis1 !k2 cum Z(s2 )e k1 ,k2 = a a X = c(0) k= a ⇥ cum Z(s2 )eis2 !k , Z(s3 )e , Z(s3 )e is3 !k2 ⇤ ⇤ ◆ ✓ ◆ ✓ ◆ x x x = f (x)sinc + k⇡ sinc + k⇡ dx change variables using ! = + k⇡ 2 2 2 1 k= a ✓ ◆ Z 1 a 1 X 2! = c(0) f !k sinc2 (!)d! ⇡ 1 k= a ✓ Z 2⇡a/ ✓ ◆ ◆ ✓ ◆ Z 1 c(0) 2! 1 2 = sinc (!) f x dx d! + O = O(1), 2 2⇡ 1 2⇡a/ c(0) 2⇡ Z 1 ✓ is3 !k is2 !k1 a X ⇤ thus proving (59). e a, (1, 0)], by using Lemma D.1 we have We now derive an expression for var[Q = e a, (1, 0)] var[Q X 1 n4 a X j1 ,j2 ,j3 ,j4 2D4 k1 ,k2 = a ✓ ⇥ cum Z(sj1 )eisj1 !k1 , Z(sj3 )e ⇥ ⇤ ⇥ +cum Z(sj1 )eisj1 !k1 , Z(sj4 )eisj4 !k2 cum Z(sj2 )e isj3 !k2 isj2 !k1 ⇤ ⇥ cum Z(sj2 )e , Z(sj3 )e isj3 !k2 ⇤ isj2 !k1 ◆ , Z(sj4 )eisj4 !k2 ✓ ◆ 1 +O . n (60) In Lemma E.2 we have shown that the covariance terms above are of order O( 1 ), thus dominating the fourth order cumulant terms which is of order O(n 1 ) (so long as << n). Lemma D.2 Suppose that {Z(s); s 2 Rd } is a Gaussian random process and {sj } are iid random variables. Then the following hold: (i) As we mentioned above, by using that the spatial process is Gaussian and (56), cum[Z(sj1 )eisj1 !k1 , . . . , Z(sjq )eisjq !kq ] can be written as the sum of products of cumulants of the spatial covariance conditioned on location. Therefore, it is easily seen that the odd order cumulant cum2q+1 [Z(sj1 ) exp(isj1 ! k1 ), . . . , Z(sj2q+1 exp(isj2q+1 ! k2q+1 ))] = 0 for all q and regardless of {sj } being dependent or not. (ii) By the same reasoning given above, we observe that if more than (q + 1) locations {sj ; j = 1, . . . , 2q} are independent, then cum2q [Z(sj1 ) exp(isj1 ! k1 ), . . . , Z(sj2q ) exp(isj2q ! k2q ))] = 0. 62 ⇤ Lemma D.3 Suppose Assumptions 2.1, 2.2 2.4 and 2.5(b) are satisfied, and d = 1. Then we have ✓ 2 ◆ log (a) e cum3 [Qa, (g, r)] = O (61) 2 with d /(log2 (a)n) ! 0 as ! 1, n ! 1 and a ! 1. e a, (1, 0)], noting that the proof is identical for general g PROOF. We prove the result for cum3 [Q and r. We first expand cum3 [Q̃a, (1, 0)] using indecomposable partitions. Using Lemma D.2(i) we note that the third order cumulant is zero, therefore = cum3 [Q̃a, (1, 0)] a X 1 X n6 j2D k1 ,k2 ,k3 = a ⇥ cum Z(sj1 )Z(sj2 ) exp(i!k1 (sj1 sj2 )), ⇤ Z(sj3 )Z(sj4 ) exp(i!k2 (sj3 sj4 )), Z(sj5 )Z(sj6 ) exp(i!k3 (sj5 sj6 )) X 1 X 1 X X 1 X j j j = A (⇡ ) + A (⇡ ) + A6 (2,2,2) (4,2) 2,2,2 4,2 n6 n6 n6 j2D ⇡(2,2,2), 2P2,2,2 j2D ⇡4,2 2P4,2 j2D = B2,2,2 + B4,2 + B6 , j where D = {j1 , . . . , j6 2 {1, . . . , n} but j1 6= j2 , j3 6= j4 , j5 6= j6 }, A2,2,2 consists of only the product of cumulants of order two and P2,2,2 is the set of all cumulants of order two from the set j of indecomposable partitions of {(1, 2), (3, 4), (5, 6)}, A4,2 consists of only the product of 4th and 2nd order cumulants and P4,2 is the set of all 4th order and 2nd order cumulant indecomposable j partitions of {(1, 2), (3, 4), (5, 6)}, finally A6 is the 6th order cumulant. 
Examples of A’s are given below j A2,2,2 (⇡(2,2,2),1 ) a X = k1 ,k2 ,k3 = a ⇥ ⇤ ⇥ cum Z(sj1 )eisj1 !k1 , Z(sj3 )eisj3 !k2 cum Z(sj2 )e ⇥ ⇤ ⇥cum Z(sj4 )e isj4 !k2 , Z(sj6 )e isj6 !k3 a X = E[c(sj1 sj3 )ei(sj1 !k1 +sj3 !k2 ) ]E[c(sj2 sj5 )ei( isj2 !k1 sj2 !k1 +sj5 !k3 ) k1 ,k2 ,k3 = a E[c(sj4 j A4,2 (⇡(4,2),1 ) = a X sj6 )e i(sj4 !k2 +sj6 !k3 ) , Z(sj5 )eisj5 !k3 ⇤ ]⇥ ], (62) cum[Z(sj4 ) exp( isj4 !k2 ), Z(sj6 ) exp( isj6 !k3 )] k1 ,k2 ,k3 = a cum[Z(sj1 ) exp(isj1 !k1 ), Z(sj2 ) exp( isj2 !k1 ), Z(sj3 ) exp(isj3 !k2 ), Z(sj5 ) exp(isj5 !k3 )], (63) j A6 = a X cum[Z(sj1 ) exp(isj1 !k1 ), Z(sj2 ) exp( isj2 !k1 ), Z(sj3 ) exp(isj3 !k2 ), k1 ,k2 ,k3 = a Z(sj4 ) exp( isj4 !k2 ), Z(sj5 ) exp(isj5 !k3 ), Z(sj6 ) exp( isj6 !k3 )], where j = (j1 , . . . , j6 ). 63 (64) Bound for B222 e a, (g; 0)). The set D is split into four sets, D6 We will show that B222 is the leading term in cum3 (Q where all the elements of j are di↵erent, and for 3 i 5, Di where i elements in j are di↵erent, such that B2,2,2 3 1 X X = 6 n i=0 j2D6 i X j A2,2,2 (⇡(2,2,2) ). ⇡(2,2,2) 2P(2,2,2) We start by bounding the partition given in (62), we later explain how the same bounds can be obtained for other indecomposable partitions in P2,2,2 . By using the spectral representation of the covariance and that |D6 | = O(n6 ), it is straightforward to show that = 1 X j A2,2,2 (⇡(2,2,2),1 ) n6 j2D6 Z Z X c6 (2⇡)3 6 R3 [ /2, k1 ,k2 ,k3 exp(i!k2 (s3 exp(ix(s1 /2]6 s4 )) exp(i!k3 (s5 s3 )) exp(iy(s4 s2 )) ⇥ f (x)f (y)f (z) exp(i!k1 (s1 s6 )) s6 )) exp(iz(s2 s5 )) 3 Y ds2j 1 ds2j dxdydz j=1 ✓ ◆ ✓ ◆ X Z c6 x z = f (x)f (y)f (z)sinc + k1 ⇡ sinc k1 ⇡ ⇥ (2⇡)3 2 2 3 R k1 ,k2 ,k3 ✓ ◆ ✓ ◆ ✓ ◆ ✓ ◆ x y z y sinc k2 ⇡ sinc k2 ⇡ sinc k3 ⇡ sinc + k3 ⇡ dxdydz, 2 2 2 2 where c6 = n(n 1) . . . (n z = 2z k3 ⇡ we have = 5)/n6 . By changing variables x = 1 X j A2,2,2 (⇡(2,2,2),1 ) n6 j2D6 X 1 Z c6 2x f( 3 3 (2⇡) R3 !k1 )f ( 2y + !k2 )f ( 2z x 2 + k1 ⇡, y = y 2 (65) k2 ⇡ and + !k3 )sinc(x)sinc(z)sinc(y) k1 ,k2 ,k3 sinc(x (k2 + k1 )⇡)sinc(y + (k3 + k2 )⇡)sinc(z (k1 k3 )⇡)dxdydz. (66) In order to understand how this case can generalise to other partitions in P1 , we represent the ks inside the sinc function using the the linear equations 0 10 1 1 1 0 k1 B CB C (67) @ 1 0 1 A @ k2 A , 0 1 1 k3 64 where we observe that the above is a rank two matrix. Based on this we make the following change of variables k1 = k1 , m1 = k2 + k1 and m2 = k1 k3 , and rewrite the sum as 1 X j A2,2,2 (⇡(2,2,2),1 ) n6 j2D6 ✓ X 1 Z c6 2x f 3 3 (2⇡) R3 = k1 ,m1 ,m2 ◆ ✓ 2y ! k1 f + !m 1 ◆ ✓ 2z + ! k1 k1 f m2 ◆ sinc(x)sinc(z)sinc(y) sinc(x m1 ⇡)sinc(y (m2 m1 )⇡)sinc(z m2 ⇡)dxdydz Z c6 X = sinc(x)sinc(x m1 ⇡)sinc(y)sinc(y + (m1 m2 )⇡)sinc(z)sinc(z 2⇡2 {z R3 | m ,m 1 2 ✓ 1X 2x ⇥ f k1 ◆ ✓ 2y ! k1 f + ! k1 contains two independent m terms ◆ ✓ 2z m1 f ! k1 m1 ◆ dxdydz. m2 ⇡) } (68) Finally, we apply Lemma C.1(iv) to give ✓ 2 ◆ 1 X j log a A2,2,2 (⇡(2,2,2),1 ) = O . 6 2 n (69) j2D6 The above only gives the bound for one partition of P1 , but we now show that the same bound applies to all the other partitions. Looking back at (66) and comparing with (68), the reason that only one of the three s in the denominator of (66) gets ‘swallowed’ is because the matrix in (67) has rank two. Therefore, there are two independent ms in the sinc function of (66), thus by applying P Lemma C.1(iv) the sum m1 ,m2 only grows with rate O(log2 a). Moreover, it can be shown that all indecomposable partitions of P2,2,2 correspond to rank two matrices (for a proof see equation (A.13) in Deo and Chen (2000)). 
Thus all indecomposable partitions in P2,2,2 will have the same order, which altogether gives ✓ 2 ◆ X 1 X log a j A2,2,2 (⇡(2,2,2) ) = O . 2 n6 j2D6 ⇡(2,2,2) 2P1 Now we consider the case that j 2 D5 . In this case, there are two ‘typical’ cases j = (j1 , j2 , j3 , j4 , j1 , j6 ), which gives j A2,2,2 (⇡(2,2,2),1 ) = ⇥ ⇤ ⇥ cum Z(sj1 )eisj1 !k1 , Z(sj3 )eisj3 !k2 cum Z(sj2 )e isj2 !k1 and j = (j1 , j2 , j1 , j4 , j5 , j6 ), which gives ⇤ ⇥ , Z(sj1 )eisj1 !k3 cum Z(sj4 )e isj4 !k2 , Z(sj6 )e isj6 !k3 isj4 !k2 , Z(sj6 )e isj6 !k3 ⇤ j A2,2,2 (⇡(2,2,2),1 ) = ⇥ ⇤ ⇥ cum Z(sj1 )eisj1 !k1 , Z(sj1 )eisj1 !k2 cum Z(sj2 )e isj2 !k1 ⇤ ⇥ , Z(sj5 )eisj5 !k3 cum Z(sj4 )e j Using the same method used to bound (69), when j = (j1 , j2 , j3 , j4 , j1 , j6 ) A2,2,2 (⇡(2,2,2),1 ) = 2 O( log2 a ). However, when j = (j1 , j2 , j1 , j4 , j5 , j6 ) we use the same proof used to prove (59) to 65 ⇤ . j give A2,2,2 (⇡(2,2,2),1 ) = O( 1 ). As we get similar expansions for all j 2 D5 and |D5 | = O(n5 ) we have 1 X n6 X ✓ ◆ 1 log2 (a) + . n n 2 ✓ ◆ 1 log2 (a) + 2 2 . n2 n j A2,2,2 (⇡(2,2,2) ) = O j2D5 ⇡(2,2,2) 2P2,2,2 Similarly we can show that 1 X n6 X 1 X n6 X j A2,2,2 (⇡(2,2,2) ) =O j2D4 ⇡(2,2,2) 2P2,2,2 and j A2,2,2 (⇡(2,2,2) ) j2D3 ⇡(2,2,2), 2P2,2,2 ✓ ◆ 1 log2 (a) =O 3 + 3 2 . n n 2 Therefore, if n >> / log2 (a) we have B2,2,2 = O( log 2(a) ). Bound for B4,2 j To bound B4,2 we consider the ‘typical’ partition given in (63). Since A4,2 (⇡(4,2),1 ) involves fourth order cumulants by Lemma D.2(ii) it will be zero in the case that the j are all di↵erent. Therefore, only a maximum of five terms in j can be di↵erent, which gives 3 1 X j 1 X X j A (⇡ ) = A4,2 (⇡(4,2),1 ). 4,2 (4,2),1 n6 n6 i=1 j2D6 j2D i j We will show that for j 2 D5 , A4,2 (⇡(4,2),1 ) will not be as small as O(log2 (a)/ 2 ), however, this will be compensated by |D5 | = O(n5 ) (noting that |D6 | = O(n6 )). Let j = (j1 , j2 , j3 , j4 , j1 , j6 ), then j expanding the fourth order cumulant in A4,2 (⇡(4,2),1 ) and using conditional cumulants (see (56)) we have j A4,2 (⇡(4,2),1 ) a X = cum[Z(sj4 ) exp( isj4 !k2 ), Z(sj6 ) exp( isj6 !k3 )] k1 ,k2 ,k3 = a cum[Z(sj1 ) exp(isj1 !k1 ), Z(sj2 ) exp( isj2 !k1 ), Z(sj3 ) exp(isj3 !k2 ), Z(sj5 ) exp(isj5 !k3 )] ⇢ a X ⇥ ⇤ = cum c(s1 s2 )ei!k1 (s1 s2 ) , c(s3 s1 )eis3 !k2 +is1 !k3 k1 ,k2 ,k3 = a ⇥ +cum c(s1 s3 )ei(s1 !k1 +s3 !k2 ) , c(s1 ⇥ +cum c(0)eis1 !k1 +is1 !k3 , c(s2 s3 )e j j s2 )eis1 !k3 is2 !k1 +is3 !k2 j is2 !k1 ⇤ ⇤ ⇥ E c(s4 = A4,2 (⇡(4,2),1 , ⌦1 ) + A4,2 (⇡(4,2),1 , ⌦2 ) + A4,2 (⇡(4,2),1 , ⌦3 ), 66 s6 )e is4 !k2 is6 !k3 ⇤ (70) where we use the notation ⌦ to denote the partition of the fourth order cumulant into it’s conditional cumulants. To bound each term we expand the covariances as expectations, this gives j A4,2 (⇡(4,2),1 , ⌦1 ) ⇢ a X ⇥ = E c(s1 s1 )ei!k1 (s1 s2 )c(s3 s2 ) is3 !k2 +is1 !k3 e k1 ,k2 ,k3 = a ⇥ E c(s1 s2 )ei!k1 (s1 s2 ) j ⇤ ⇥ E c(s3 j ⇤ ⇥ s1 )eis3 !k2 +is1 !k3 E c(s4 ⇤ ⇥ E c(s4 s6 )e s6 )e is4 !k2 is6 !k3 is4 !k2 is6 !k3 = A4,2 (⇡(4,2),1 , ⌦1 , ⇧1 ) + A4,2 (⇡(4,2),1 , ⌦1 , ⇧2 ), ⇤ ⇤ where we use the notation ⇧ to denote the expansion of the cumulants of the spatial covariances j expanded into expectations. To bound A4,2 (⇡(4,2),1 , ⌦1 , ⇧1 ) we use the spectral representation theorem to give j A4,2 (⇡(4,2),1 , ⌦1 , ⇧1 ) = ✓ ◆ ✓ ◆ Z a X 1 (x y) x f (x)f (y)f (z)sinc + (k1 + k3 )⇡ sinc + k1 ⇡ ⇥ (2⇡)3 R3 2 2 k1 ,k2 ,k3 = a ✓ ◆ ✓ ◆ ✓ ◆ y z z sinc + k2 ⇡ sinc k2 ⇡ sinc + k3 ⇡ dxdydz. 2 2 2 By changing variables j A4,2 (⇡(4,2),1 , ⌦1 , ⇧1 ) = 1 ⇡3 3 a X Z 3 k1 ,k2 ,k3 = a R sinc(u f ✓ 2u ◆ ✓ 2v ! 
k1 f ◆ ✓ 2w ! k2 f v + (k2 + k3 )⇡)sinc(u)sinc(v)sinc(w)sinc(w Just as in the bound for B2,2,2 , we represent the equations 0 10 0 1 1 B CB @ 0 0 0 A@ 0 1 1 ! k3 ◆ ⇥ (k2 + k3 )⇡)dudvdw. ks inside the sinc function as a set of linear 1 k1 C k2 A , k3 (71) observing the matrix has rank one. We make a change of variables m = k2 + k3 , k1 = k1 and k2 = k2 to give j A4,2 (⇡(4,2),1 , ⌦1 , ⇧1 ) ✓ ◆ ✓ ◆ ✓ ◆ Z a X 1 1 2u 2v 2w = f ! k1 f ! k2 f ! m 1 k2 ⇥ ⇡ 3 R3 2 k1 ,k2 = a X sinc(u v + m⇡)sinc(u)sinc(v)sinc(w)sinc(w m⇡)dudvdw = m 23 Z X R3 m G ,m ✓ 2u 2v 2w , , ◆ sinc(u v + m⇡)sinc(u)sinc(v)sinc(w)sinc(w 67 m⇡)dudvdw, where G gives ,m ( 2u 2v 2w , , j |A4,2 (⇡(4,2),1 , ⌦1 , ⇧1 )| 1 )= C 2 Z Pa k1 ,k2 = a f ( X R3 m |sinc(u 2u !k1 )f ( 2v !k2 )f ( 2w v + m⇡)sinc(w !m k2 ). Taking absolutes m⇡)sinc(u)sinc(v)sinc(w)|dudvdw Since the above contains m in the sinc function we use Lemma C.1(i), equations (42) and (43), to show Z X C j |A4,2 (⇡(4,2),1 , ⌦1 , ⇧1 )| |sinc(u v + m⇡)sinc(w m⇡)sinc(u)sinc(v)sinc(w)|dudvdw R3 m CX 1 `1 (m⇡)`2 (m⇡) = O( ), m j where the functions `1 (·) and `2 (·) are defined in Lemma C.1(i). Thus |A4,2 (⇡(4,2),1 , ⌦1 , ⇧1 )| = O( 1 ). j 2 We use the same method used to bound (69) to show that A4,2 (⇡(4,2),1 , ⌦1 , ⇧2 ) = O( log 2(a) ) and j |A4,2 (⇡(4,2),1 , ⌦2 )| = O( 1 ). Furthermore, it is straightforward to see that by the independence of s1 j j and s2 and s3 that A4,2 (⇡(4,2),1 , ⌦3 ) = 0 (recalling that A4,2 (⇡(4,2),1 , ⌦3 ) is defined in equation (70)). j Thus altogether we have for j = (j1 , j2 , j3 , j4 , j1 , j6 ) and the partition ⇡(4,2),1 , that |A4,2 (⇡(4,2),1 )| = O( 1 ). However, it is important to note for all other j 2 D5 and partitions in P4,2 the same method will lead to a similar decomposition given in (70) and the rank one matrix given in (71). The rank j one matrix means one ‘free’ m in the sinc functions and this |A4,2 (⇡4,2 )| = O( 1 ) for all j 2 D5 and ⇡4,2 2 P4,2 . Thus, since |D5 | = O(n5 ) we have 1 X n6 X j A4,2 (⇡2 ) j2D5 ⇡4,2 2P4,2 =O ✓ 1 log2 a + 2 n n ◆ =O ✓ log2 a 2 ◆ if n >> / log2 (a) (ie. n log2 a ! 0). For j 2 D4 and j 2 D3 we use the same argument, noting that the number of free m’s in the sinc functions goes down but to compensate, |D4 | = O(n4 ) and |D3 | = O(n3 ). Therefore, if n >> log2 (a)/ , then B4,2 = 3 1 X X n6 i=1 j2D6 i X j A4,2 (⇡2 ) = O ⇡4,2 2P4,2 ✓ log2 a 2 ◆ . Bound for B6 j Finally, we bound B6 . By using Lemma D.2(ii) we observe that A6 (k1 , k2 , k3 ) = 0 if more than four elements of j are di↵erent. Thus 3 1 X X j B6 = 6 A6 . n i=2 j2D6 68 i We start by considering the case that j = (j1 , j2 , j1 , j4 , j1 , j6 ) (three elements in j are the same), then by using conditional cumulants we have j A6 a X = = cum[Z(s1 )eis1 !k1 , Z(s2 )e k1 ,k2 ,k3 = a a X X is2 !k1 , Z(s1 )eis1 !k2 , Z(s4 )e is4 !k2 , Z(s1 )eis1 !k3 , Z(s6 )e is6 !k3 j A6 (⌦), k1 ,k2 ,k3 = a ⌦2R where R is the set of all pairwise partitions of {1, 2, 1, 4, 1, 6}, for example ⇥ j A6 (⌦1 ) = cum c(s1 s2 )ei!k1 (s1 s2 ) , c(s1 s4 )ei!k2 (s1 s4 ) , c(s1 s6 )ei!k3 (s1 s6 ) ⇤ . We will first bound the above and then explain how this generalises to the other ⌦ 2 R and j 2 D4 . 
Expanding the above third order cumulant in terms of expectations gives j = A6 (⌦1 ) a X k1 ,k2 ,k3 = a ⇢ ⇥ E c(s1 s2 )ei!k1 (s1 s2 ) c(s1 s4 )ei!k2 (s1 s4 ) c(s1 s6 )ei!k3 (s1 ⇥ ⇤ ⇥ ⇤ E c(s1 s2 )ei!k1 (s1 s2 ) E c(s1 s4 )ei!k2 (s1 s4 ) c(s1 s6 )ei!k3 (s1 s6 ) ⇥ ⇥ ⇤ E c(s1 s2 )ei!k1 (s1 s2 ) c(s1 s4 )ei!k2 (s1 s4 ) E c(s1 s6 )ei!k3 (s1 s6 ) ⇥ ⇤ ⇥ ⇤ E c(s1 s2 )ei!k1 (s1 s2 ) c(s1 s6 )ei!k3 (s1 s6 ) E c(s1 s4 )ei!k2 (s1 s4 ) + ⇥ ⇤ ⇥ ⇤ ⇥ ⇤ 2E c(s1 s2 )ei!k1 (s1 s2 ) E c(s1 s4 )ei!k2 (s1 s4 ) E c(s1 s6 )ei!k3 (s1 s6 ) = 5 X s6 ) ⇤ j A6 (⌦1 , ⇧` ). `=1 j j We observe that for 2 ` 5, A6 (⌦1 , ⇧` ) resembles A4,2 (⇡(4,2),1 , ⌦1 , ⇧1 ) defined in (71), thus the j same proof used to bound the terms in (71) can be use to show that for 2 ` 5 A6 (⌦1 , ⇧` ) = O( 1 ). j However, the first term A6 (⌦1 , ⇧1 ) involves just one expectation, and is not included in the previous cases. By using the spectral representation theorem we have j A6 (⌦1 , ⇧1 ) a h X = E c(s1 s2 )ei!k1 (s1 s2 ) c(s1 s4 )ei!k2 (s1 k1 ,k2 ,k3 = a Z Z Z a X 1 = (2⇡)3 k1 ,k2 ,k3 = a ✓ ◆ ✓ y sinc + k2 ⇡ sinc 2 Z Z a X 23 = (2⇡)3 3 k1 ,k2 ,k3 = a f (x)f (y)f (z)sinc ✓ ◆ z + k3 ⇡ dxdydz 2 Z 2u 2v f( !k1 )f ( sinc(u + v + w)sinc(u)sinc(v)sinc(w)dudvdw. 69 s4 ) c(s1 s6 )ei!k3 (s1 s6 ) i ◆ ✓ ◆ (x + y + z) x + (k1 + k2 + k3 )⇡ sinc + k1 ⇡ 2 2 !k2 )f ( 2w ! k3 ) ⇥ ] It is obvious that the ks within the sinc function correspond to a rank zero matrix, and thus j A6 (⌦1 , ⇧1 ) = O(1). Therefore, A6 (⌦1 ) = O(1). A similar bound holds for all j 2 D4 , this we have ✓ ◆ 1 X j 1 A (⌦ ) = O , 1 6 n6 n2 j2D4 since |D4 | = O(n4 ). Indeed, the same argument applies to the other partitions ⌦ and j 2 D3 , thus altogether we have ✓ ◆ 3 1 XX X 1 j B6 = 6 A6 (⌦) = O . n n2 ⌦2R i=2 j2D6 i Altogether, using the bounds derived for B2,2,2 , B4,2 and B6 we have ✓ 2 ◆ ✓ 2 ◆ log (a) 1 log2 (a) 1 log (a) e cum4 (Qa, (1, 0)) = O + + + 2 =O , 2 2 2 n n n ⇤ where the last bound is due to the conditions on a, n and . This gives the result. We now generalize the above results to higher order cumulants. Lemma D.4 Suppose Assumptions 2.1, 2.2, 2.4 and 2.5(b) are satisfied, and d = 1. Then for q 3 we have ✓ 2(q 2) ◆ log (a) e cumq [Qa, (g, r)] = O , (72) q 1 ✓ 2(q log e e cumq [Qa, (g, r1 ), . . . , Qa, (g, rq )] = O q 2) (a) 1 ◆ (73) and in the case d > 1, we have with d /(log2 (a)n) ✓ 2d(q log e e cumq [Qa, (g, r 1 ), . . . , Qa, (g, r q )] = O d(q ! 0 as 2) (a) 1) ◆ (74) ! 1, n ! 1 and a ! 1. PROOF. The proof essentially follows the same method used to bound the second and third cumulants. We first prove (72). To simplify the notation we prove the result for g = 1 and r = 0, e a, (1, 0)] and using noting that the proof in the general case is identical. Expanding out cumq [Q indecomposable partitions gives = = e a, (1, 0)] cumq [Q a X X 1 n2q j1 ,...,j2q 2D k1 ,...,kq = a h cum Z(sj1 )Z(sj2 )ei!k1 (sj1 1 XX X j A2b (⇡2b ) n2q j2D b2Bq ⇡b 2P2b 70 sj2 ) , . . . , Z(sj2q 1 )Z(sj2q )ei!kq (sj2q 1 sj2q ) i where Bq corresponds to the set of integer partitions of q (more precisely, each partition is a sequence of positive integers which sums to q). The notation used here is simply a generalization P of the notation used in Lemma D.3. Let b = (b1 , . . . , bm ) (noting j bj = q) denote one of these partitions, then P2b is the set of all indecomposable partitions of {(1, 2), (3, 4), . . . , (2q 1, 2q)} where the size of each partition is 2b1 , 2b2 , . . . , 2bm . 
For example, if q = 3, then one example of an element of B3 is b = (1, 1, 1) and P2b = P(2,2,2) corresponds to all pairwise indecomposable partitions j of {(1, 2), (3, 4), (5, 6)}. Finally, A2b (⇡2b ) corresponds to the product of one indecomposable par⇥ ⇤ tition of the cumulant cum Z(sj1 )Z(sj2 )ei!k1 (sj1 sj2 ) , . . . , Z(sj2q 1 )Z(sj2q )ei!kq (sj2q 1 sj2q ) , where the cumulants are of order 2b1 , 2b2 , . . . , 2bm (examples, in the case q = 3 are given in equation (62)-(64)). Let B2b = 1 X X j A2b (⇡2b ), 2q n j2D ⇡2b 2P2b e a, (1, 0)] = P therefore cumq [Q b2Bq B2b . Just as in the proof of Lemma D.3, we will show that under the condition n >> / log2 (a), the pairwise decomposition B2,...,2 is the denominating term. We start with a ‘typical’ decomposition ⇡(2,...,2),1 2 P2,...,2 , j A2,...,2 (⇡(2,...,2),1 ) = a X cum[Z(s1 )e is1 !k1 , Z(s2q )e is2q !kq ] qY1 is2c !kc cum[Z(s2c )e , Z(s2c+1 )eis2c+1 !kc+1 ]. c=1 k1 ,...,kq = a and q 1 1 X j 1 X X j A2,...,2 (⇡(2,...,2),1 ) = 2q A2,...,2 (⇡(2,...,2),1 ), 2q n n i=0 j2D2q j2D i where D2q denotes the set where all elements of j are di↵erent and D2q i denotes the set that (2q i) elements in j are di↵erent. We first consider the case that j = (1, 2, . . . , 2q) 2 D2q . Using e a, (1, 0)] we can show that e a, (1, 0)] and cum3 [Q identical arguments to those used for var[Q j A2,...,2 (⇡(2,...,2),1 ) = Z a X q k1 ,...,kq = a R qY1 f (xc )sinc c=1 f (xq )sinc ✓ ✓ 2xq ◆ 2xc ◆ + k1 ⇡ sinc kc ⇡ sinc ✓ 2xc ✓ kc+1 2xq ◆ + kq ⇡ ⇥ ◆Y q dxc . (75) c=1 By a change of variables we get j A2,...,2 (⇡(2,...,2),1 ) = 1 q a X Z qY1 q k1 ,...,kq = a R c=1 ⇥f ✓ uq 2 f ◆ ✓ ◆ uc + !c sinc(uc )sinc(uc 2 !1 sinc(uq )sinc(uq + (kq 71 k1 )⇡) (kc+1 q Y c=1 dxc . kc )⇡) As in the proof of the third order cumulant we can rewrite the ks in the above as a matrix equation 0 10 1 1 1 0 ... 0 0 k1 B CB C 1 1 . . . 0 0 C B k2 C B 0 B . C B C .. .. .. B . CB . C, . . . 0 0 A @ .. A @ . 1 0 ... ... 0 1 kq noting that that above is a (q 1)-rank matrix. Therefore applying the same arguments that were e a, (1, 0)] and also Lemma C.1(iii) we can show that Aj used in the proof of cum3 [Q 2,...,2 (⇡(2,...,2),1 ) = j log2(q 2) (a) log2(q 2) (a) 1 P O( ). Thus for j 2 D2q we have n2q j2D2q A2,...,2 (⇡(2,...,2),1 ) = O( ). q 1 q 1 In the case that j 2 D2q 1 ((2q 1)-terms in j are di↵erent) by using the same arguments as those 2(q 3) e a, (1, 0)]) we have Aj used to bound A2,2,2 (in the proof of cum[Q (⇡(2,...,2),1 ) = O( log q 2 (a) ), 2,...,2 j similarly if (2q 2)-terms in j are di↵erent, then A2,...,2 (⇡(2,...,2),1 ) = O( log Therefore, since |D2q i | = O(n2q i ) we have 2(q 4) (a) q 3 ) and so forth. q X 1 X j log2(q 2 i) (a) |A (⇡ )| C . 2,...,2 (2,...,2),1 q 1 i ni n2q i=0 j2D Now by using that n >> / log2 (a) we have 1 X j |A2,...,2 (⇡(2,...,2),1 )| = O n2q log2(q 2) q 1 j2D (a) ! . The same argument holds for every other second order cumulant indecomposable partition, because the corresponding matrix will always have rank (q 1) in the case that j 2 D2q or for j 2 D2q i and the dependent sj ’s lie in di↵erent cumulants (see Deo and Chen (2000)), thus B2,...,2 = 2(q 1) O( log q 1 (a) ). Now, we bound the other extreme B2q . Using the conditional cumulant expansion (56) and noting that cum2q [Z(sj1 )eisj1 !k1 , . . . , Z(sj2q )e isj2q !kq ] is non-zero, only when at most (q + 1) elements of j are di↵erent we have B2q = 1 X n2q a X cum2q [Z(sj1 )eisj1 !k1 , . . . , Z(sj2q )e isj2q !kq ] j2D k1 ,...,kq = a = 1 X n2q X j A2q (⌦). 
j2Dq+1 ⌦2R2q where R2q is the set of all pairwise partitions of {1, 2, 1, 3, . . . , 1, q}. We consider a ‘typical’ partition ⌦1 2 R2q j A2q (⌦1 ) = a X k1 ,...,kq = a ⇥ cum c(s1 s2 )ei(s1 s2 )!k1 72 , . . . , c(s1 sq+1 )ei(s1 sq+1 )!kq ⇤ . (76) By expanding the above the cumulant as the sum of the product of expectations we have 1 X X X j j A2q (⌦1 ) = 2q A2q (⌦1 , ⇧), n j2Dq+1 ⌦2R2q ⇧2Sq e a, (1, 0)] and where Sq is the set of all partitions of {1, . . . , q}. As we have seen in both the var[Q e a, (1, 0)] calculations, the leading term in the cumulant expansion is the expectation over cum3 [Q all the covariance terms. The same result holds true for higher order cumulants, the expectation over all the covariances in that cumulant is the leading term because the it gives the linear equation of the ks in the sinc function with the lowest order rank (we recall the lower the rank the less ‘free’ s). Based on this we will only derive bounds for the expectation over all the covariances. Let ⇧1 2 Sq , where a X j A2q (⌦1 , ⇧1 ) = E k1 ,...,kq = a q ⇥Y sc+1 )ei(s1 c(s1 sc+1 )!kc c=1 ⇤ . Representing the above expectation as an integral and using the spectral representation theorem and a change of variables gives j A2q (⌦1 , ⇧1 ) = a X E = a X 1 (2⇡)q Z q k1 ,k2 ,k3 = a R Z a X 2q (2⇡)q q sc+1 )ei(s1 c(s1 sc+1 )!kc c=1 k1 ,...,kq = a = q ⇥Y sinc q k1 ,...,kq = a R ✓ sinc( ( Pq c=1 xc ) +⇡ 2 q X c=1 ⇤ q X kc c=1 ✓ q Y 2uc uc ) f ◆Y q f (xc )sinc(xc + kc ⇡) c=1 q Y dxc c=1 ◆ !kc sinc(uc )duc = O(1), c=1 j where the last line follows from Lemma C.1, equation (44). Therefore, A2q (⌦1 ) = O(1). By using the same method on every partition ⌦ 2 Rq+1 and j 2 Dq+1 and |Dq+1 | = O(nq+1 ), we have B2q = 1 X n2q X X Aj (⌦1 , ⇧1 ) = O( j2Dq+1 ⌦2R2q ⇧2S2q 1 nq 1 ). Finally, we briefly discuss the terms B2b which lie between the two extremes B2,...,2 and B2q . P Since B2b is the product of 2b1 , . . . , 2bm cumulants, by Lemma D.2(ii) at most m j=1 (bj +1) = q +m elements of j can be di↵erent. Thus B2b q+m 1 X X = 2q n X j A2b (⇡2b ). i=q j2Di ⇡2b 2P2b By expanding the cumulants in terms of the cumulants of covariances conditioned on the location (which is due to Gaussianity of the random field, see for example, (76)) we have B2b q+m 1 X X = 2q n X X i=q j2Di ⇡2b 2P2b ⌦2R2b 73 j A2b (⇡2b , ⌦), where R2b is the set of all paired partitions of {(1, . . . , 2b1 ), . . . , (2bm 1 + 1, . . . , 2bm )}. The leading terms are the highest order expectations. This term leads to a matrix equation for the k’s within the sinc functions, where the rank of the corresponding matrix is at least (m 1) (we do not give 2(m 1) 2(q 2) (a) a formal proof of this). Therefore, B2b = O( log ) = O( log q 1 (a) ) (since n >> / log2 (a)). nq m m 1 This concludes the proof of (72). The proof of (73) is identical and we omit the details. To prove the result for d > 1, (74) we use the same method, the main di↵erence is that the spectral density function in (75) is a multivariate function of dimension d, there are 2dp sinc functions and the integral is over Rdp , however the analysis is identical. 
⇤ PROOF of Theorem 3.7 By using the well known identities 1 <cov(A, B) + <cov(A, B̄) 2 1 <cov(A, B) <cov(A, B̄) , 2 1 =cov(A, B) =cov(A, B̄) , 2 cov(<A, <B) = cov(=A, =B) = cov(<A, =B) = (77) and equation (22), we immediately obtain d d d e a, (g; 0)] = var[<Q h 1 (C1 (0) + <C2 (0)) + O(` 2 i e a, (g; r 1 ), <Q e a, (g; r 2 ) = cov <Q h i e a, (g; r 1 ), =Q e a, (g; r 2 ) = cov =Q and d h i 8 > < > : 8 > < > : e a, (g; r 1 ), =Q e a, (g; r 2 ) = cov <Q ( < 2 C1 (! r ) < 2 C2 (! r ) + O(` + O(` O(` ,a,n ) ,a,n ), ,a,n ) r 1 = r 2 (= r) ,a,n ) r 1 = r 2 (= r) otherwise < 2 C1 (! r ) + O(` ,a,n ) < 2 C2 (! r ) + O(` ,a,n )) O(` ,a,n ) O(` ,a,n ) + O(` = 2 C2 (! r ) r 1 = r 2 (= r) r 1 = r 2 (= r) otherwise r1 = 6 r2 ,a,n ) r 1 = r 2 (= r) Similar expressions for the covariances of <Qa, (g; r) and =Qa, (g; r) can also be derived. e a, (g; r) and =Q e a, (g; r) follows from Lemma D.4. Thus Finally, asymptotic normality of <Q giving (23). ⇤ PROOF of Theorem 3.4 The proof is identical to the proof Lemma D.4, we omit the details. ⇤ PROOF of Theorem 3.5 The proof is similar to the proof of Theorem 3.7, we omit the details. 74 E Additional proofs In this section we prove the remaining results required in this paper. For example, Theorems 3.1, 3.2, 3.3(i,iii), 3.6 and A.2, involve replacing sums with integrals. In the case that the frequency domain is increasing stronger assumptions are required than in the case of fixed frequency domain. We state the required result in the following lemma. Lemma E.1 Let us suppose the function g1 , g2 are bounded (sup!2Rd |g1 (!)| < 1 and sup!2Rd |g2 (!)| < 1 (!) 2 (!) 1) and for all 1 j d, sup!2Rd | @g@! | < 1 and sup!2Rd | @g@! | < 1. j j (i) Suppose a/ = C (where C is a fixed finite constant) and h is a bounded function whose first partial derivative 1 j d, sup!2Rd | @h(!) @!j | < 1. Then we have a 1 X d g1 (! k )h(! k ) k= a 1 (2⇡)d Z 2⇡[ C,C]d g1 (!)h(!)! K 1 , where K is a finite constant independent of . (ii) Suppose a/ ! 1 as ! 1. Furthermore, h(!) (!) and for all 1 j d the partial derivatives satisfy | @h(!) (!). Then uniformly over a we have @!j | a 1 X d 1 (2⇡)d g1 (! k )h (! k ) k= a Z 2⇡[ a/ ,a/ ]d g1 (!)h (!) d! K 1 (iii) Suppose a/ ! 1 as ! 1. Furthermore, f4 (! 1 , ! 2 , ! 3 ) (! 1 ) (! 2 ) (! 3 ) and for 1 ,! 2 ,! 3 ) all 1 j 3d the partial derivatives satisfy | @f4 (!@! | (!). j 1 2d a X g1 (! k1 )g2 (! k2 )f4 (! k1 +r1 , ! k2 , ! k2 +r2 ) k1 ,k2 = a 1 (2⇡)d Z 2⇡[ a/ ,a/ ]d Z 2⇡[ a/ ,a/ ]d g1 (! 1 )g2 (! 2 )f4 (! 1 + ! r1 , ! 2 , ! 2 + !r2 ) K 1 . PROOF. We first prove the result in the univariate case. We expand the di↵erence between sum and integral Z 2⇡a/ a 1 X 1 g1 (!k )h(!k ) g1 (!)h(!)d! 2⇡ 2⇡a/ k= a = a 1 X g1 (!k )h(!k ) k= a a 1 Z 1 X 2⇡/ g1 (!k + !)h(!k + !)d!. 2⇡ 0 k= a By applying the mean value theorem for integrals to the integral above we have a a 1 Z 1 X 1 X 2⇡/ = g1 (!k )h(!k ) g1 (!k + !)h(!k + !)d! 2⇡ 0 = 1 k= a a X k= a [g1 (!k )h(!k ) g1 (!k + !k )h(!k + !k )] k= a 75 where !k 2 [0, 2⇡ ]. Next, by applying the mean value theorem to the di↵erence above we have a 1 X [g1 (!k )h(!k ) k= a a 1 X ⇥ g1 (!k + !k )h(!k + !k )] 2 g10 (e !k )h(e !k ) + g1 (e !k )h0 (e !k ) k= a ⇤ (78) where ! ek 2 [!k , !k + !] (note this is analogous to the expression given in Brillinger (1981), Exercise 1.7.14). Under the condition that a = C , g1 (!) and h(!) are bounded and using (78) it is clear that a 1 X 1 2⇡ g1 (!k )h(!k ) k= a Z 2⇡a/ g1 (!)h(!)d! 2⇡a/ a sup! |h0 (!)g1 (!)| + sup! |h(!)g10 (!)| X 2 1 1=C . k= a For d = 1, this proves (i). 
In the case that a/ ! 1 as ! 1, we use that h and h0 are dominated by a montonic function and that g1 is bounded. Thus by using (78) we have a 1 X g1 (!k )h(!k ) k= a 2(sup! |g1 (!)| + 2 1 2⇡ Z 2⇡a/ 2⇡a/ sup! |g10 (!)|) a X k=0 g1 (!)h(!)d! (!k ) C Z 1 a 1 X 2 k= a sup !k !!k+1 (!)d! = O( 1 (|g10 (!)h(!)| + |g1 (!)h0 (!)|) ). 0 For d = 1, this proves (ii). To prove the result for d = 2 we take di↵erences, that is Z a/ Z a/ a a 1 X X 1 g(!k1 , !k2 )h(!k1 , !k2 ) g(!1 , !2 )h(!1 , !2 ) 2 (2⇡)2 a/ a/ k1 = a k2 = a 0 1 Z a/ a a 1 X @1 X 1 = g(!k1 , !k2 )h(!k1 , !k2 ) g(!k1 , !2 )h(!k1 , !2 )A (2⇡) a/ k1 = a k2 = a 0 1 Z a/ Z a/ a X @1 + g(!k1 , !2 )h(!k1 , !2 ) g(!1 , !2 )h(!1 , !2 )A . a/ a/ k1 = a For each of the terms above we apply the method described for the case d = 1; for the first term we take the partial derivative over !2 and the for the second term we take the partial derivative over !1 . This method can easily be generalized to the case d > 2. The proof of (iii) is identical to the proof of (ii). We mention that the assumptions on the derivatives (used replace sum with integral) can be relaxed to that of bounded variation of the function. However, since we require the bounded derivatives to decay at certain rates (to prove other results) we do not relax the assumption here. ⇤ We now prove the claims at the start of Appendix A.2. 76 Lemma E.2 Suppose Assumptions 2.1, 2.2 and (i) Assumptions 2.4(i) and 2.4(a,c) hold. Then we have ( h i C1 (! r ) + O( 1 + d e a, (g; r 1 ), Q e a, (g; r 2 ) = cov Q d O( 1 + n ) d n ) r 1 = r 2 (= r) r 1 6= r 2 and d where h i e a, (g; r 1 ), Q e a, (g; r 2 ) = cov Q ( C2 (! r ) + O( 1 + d O( 1 + n ) d n ) r1 = r 1 6= r 2 (= r) r2 C1 (! r ) = C1,1 (! r ) + C1,2 (! r ) and C2 (! r ) = C2,1 (! r ) + C2,2 (! r ), (79) with C1,1 (! r ) = C1,2 (! r ) = C2,1 (! r ) = C2,2 (! r ) = Z 1 f (!)f (! + ! r )|g(!)|2 d!, (2⇡)d 2⇡[ a/ ,a/ ]d Z 1 f (!)f (! + ! r )g(!)g( ! ! r )d!, (2⇡)d Dr Z 1 f (!)f (! + ! r )g(!)g( !)d!, (2⇡)d 2⇡[ a/ ,a/ ]d Z 1 f (!)f (! + ! r )g(!)g(! + ! r )d!, (2⇡)d Dr where the integral is defined as R R 2⇡ min(a,a r1 )/ R 2⇡ min(a,a Dr = 2⇡ max( a, a r1 )/ . . . 2⇡ max( a, rd )/ a rd )/ (80) (note that C1,1 (! r ) and C1,2 (! r ) are real). (ii) Assumptions 2.4(ii) and 2.5(b) hold. Then d d where h i e a, (g; r 1 ), Q e a, (g; r 2 ) = A1 (r 1 , r 2 ) + A2 (r 1 , r 2 ) + O( cov Q h i e a, (g; r 1 ), Q e a, (g; r 2 ) = A3 (r 1 , r 2 ) + A4 (r 1 , r 2 ) + O( cov Q A1 (r 1 , r 2 ) = 1 ⇡ 2d d 2a X A2 (r 1 , r 2 ) = ⇡ 2d d X 2d m= 2a k=max( a, a+m) R g(! k )g(! k 1 Z min(a,a+m) 2a X ! m )Sinc(u Z min(a,a+m) X ! k )Sinc(u 2u ! k )f ( 2v m⇡)Sinc(v + (m + r 1 2d m= 2a k=max( a, a+m) R g(! k )g(! m f( f( 2u ! k )f ( 2v d n ), d n ), + ! k + ! r1 ) r 2 )⇡)Sinc(u)Sinc(v)dudv + ! k + ! r1 ) (m + r 2 )⇡)Sinc(v + (m + r 1 )⇡)Sinc(u)Sinc(v)dudv 77 A3 (r 1 , r 2 ) = 1 ⇡ 2d d 2a X 1 ⇡ 2d X m= 2a k=max( a, a+m) g(! k )g(! m A4 (r 1 , r 2 ) = min(a,a+m) d 2a X R2d f( 2u ! k )f ( 2v + ! k + ! r1 ) ! k )Sinc(u + m⇡)Sinc(v + (m + r 2 + r 1 )⇡)Sinc(u)Sinc(v)dudv min(a,a+m) X m= 2a k=max( a, a+m) g(! k )g(! k Z ! m )Sinc(u Z R2d (m f( 2u ! k )f ( 2v + ! k + ! r1 ) r 2 )⇡)Sinc(v + (m + r 1 )⇡)Sinc(u)Sinc(v)dudv where k = max( a, a m) = {k1 = max( a, a m1 ), . . . , kd = max( a, a k = min( a + m) = {k1 = max( a, a + m1 ), . . . , kd = min( a + md )}. md )}, (iii) Assumptions 2.4(ii) and 2.5(b) hold. Then we have h i h i d d e a, (g; r 1 ), Q e a, (g; r 2 ) < 1 and e a, (g; r 1 ), Q e a, (g; r 2 ) < 1, sup cov Q sup cov Q a if d /n a ! c (0 c < 1) as ! 1 and n ! 1. 
PROOF We prove the result in the case d = 1 (the proof for d > 1 is identical). We first prove (i). By using indecomposable partitions, Theorem 2.1 and Lemma D.1 in Subba Rao (2015b), noting that the fourth order cumulant is of order O(1/n), it is straightforward to show that h i e a, (g; r1 ), Q e a, (g; r2 ) cov Q ✓ a n d X 1 X = g(!k1 )g(!k2 ) cov Jn (!k1 )Jn (!k1 +r1 ) Z(sj )2 e isj !r , n j=1 k1 ,k2 = a ◆ n d X Jn (!k2 )Jn (!k2 +r2 ) Z(sj )2 e isj !r n j=1 8 P a 1 > k= a f (!k )f (!k + !r )g(!k )g(!k ) < P min(a,a r) 1 a = + k=max( a, a r) f (!k )f (!k + !r )g(!k )g( !k+r ) + O( 2 + n ) r1 = r2 > : O( a2 + n ) r1 6= r2 Since a = C we have O( a2 ) = O( 1 ) and by replacing sum with integral (using Lemma E.1(i)) we have ( h i C1 (!r ) + O( 1 + n ) r1 = r2 (= r) e a, (g; r1 ), Q e a, (g; r2 ) = cov Q O( 1 + n ) r1 6= r2 where C1 (!r ) = C11 (!r ) + C12 (!r ) + O with C11 (!r ) = C12 (!r ) = 1 2⇡ 1 2⇡ Z Z ✓ ◆ 1 , 2⇡a/ f (!)f (! + !r )g(!)g(!)d! 2⇡a/ 2⇡ min(a,a r)/ f (!)f (! + !r )g(!)g( ! 2⇡ max( a, a r)/ 78 !r )d!. We now show that C1 (!r ) = C11 (!r ) + C12 (!r ). It is clear that C11 (!r ) is real. Thus we focus on C12 (!r ). To do this, we write g(!) = g1 (!) + ig2 (!), therefore =g(!)g( ! !r ) = [g2 (!)g1 ( ! !r ) g1 (!)g2 ( ! !r )]. Substituting the above into =C12 (!r ) gives us =C12 (r) = [C121 (r) + C122 (r)] where 1 2⇡ C121 (r) = Z 1 2⇡ C122 (r) = 2⇡ min(a,a r)/ f (!)f (! + !r )g2 (!)g1 ( ! 2⇡ max( a, a r)/ Z 2⇡ min(a,a r)/ We will show that C122 (r) = u = ! !r gives us C122 1 = 2⇡ Z f (!)f (! + !r )g1 (!)g2 ( ! !r )d! !r )d! 2⇡ max( a, a r)/ C121 (r). Focusing on C122 and making the change of variables 2⇡ max( a, a r)/ f (u + !r )f ( u)g1 ( u !r )g2 (u)du, 2⇡ min(a,a r)/ R 2⇡ max( a, a r)/ noting that the spectral density function is symmetric with f ( u) = f (u), and that 2⇡ min(a,a r)/ = R 2⇡ min(a,a r)/ 2⇡ max( a, a r)/ . Thus we have =C12 (r) = 0, which shows that C1 (!r ) is real. The proof of d cov[Q e e a, (g; r 2 )] is the same. Thus, we have proven (i). (g; r 1 ), Q e a, (g; r1 ), Q e a, (g; r2 )] to give To prove (ii) we first expand cov[Q = a, e a, (g; r1 ), Q e a, (g; r2 )] cov[Q a n X X g(!k1 )g(!k2 ) j1 ,...,j4 =1 k1 ,k2 = a ✓ ⇥ E c(sj1 sj3 )eisj1 !k1 isj3 !k2 ⇤ ⇥ E c(sj2 sj4 )e isj2 !k1 +r1 +isj4 !k2 +r2 ⇥ ⇤ ⇥ ⇤ E c(sj1 sj4 )eisj1 !k1 +isj4 !k2 +r2 E c(sj2 sj3 )e isj2 !k1 +r1 isj3 !k2 + ◆ ⇥ ⇤ i!k1 sj1 isj2 (!k1 +!r2 ) i!k2 sj3 isj4 (!k2 +!r2 ) cum Z(sj1 )e , Z(sj2 )e , Z(sj3 )e , Z(sj4 )e . By applying Lemma D.1, (59) in Subba Rao (2015b), to the fourth order cumulant we can show that ✓ ◆ h i e a, (r1 ), Q e a, (g; r2 ) = A1 (r1 , r2 ) + A2 (r1 , r2 ) + O (81) cov Q n where A1 (r1 , r2 ) = a X k1 ,k2 = a ⇥ ⇥ g(!k1 )g(!k2 )E c(s1 s3 ) exp(is1 !k1 ⇤ is3 !k2 ) ⇥ ⇤ E c(s2 s4 ) exp( is2 !k1 +r1 + is4 !k2 +r2 ) a X ⇥ ⇤ A2 (r1 , r2 ) = g(!k1 )g(!k2 )E c(s1 s4 ) exp(is1 !k1 + is4 !k2 +r2 ) ⇥ k1 ,k2 = a ⇥ E c(s2 s3 ) exp( is2 !k1 +r1 79 ⇤ is3 !k2 ) . ⇤ + Note that the O( /n) term include the error n 1 [A1 (r1 , r2 )] + A2 (r1 , r2 )] (we show below that A1 (r1 , r2 ) and A2 (r1 , r2 ) are both bounded over and a). To write A1 (r1 , r2 ) in the form stated in the lemma we integrate over s1 , s2 , s3 and s4 to give = A1 (r1 , r2 ) a X g(!k1 )g(!k2 ) 1 4 k1 ,k2 = a Z [ /2, /2]4 c(s1 s3 )c(s2 s4 )eis1 !k1 is3 !k2 e is2 !k1 +r1 +is4 !k2 +r2 ds1 . . . ds4 . By using the spectral representation theorem and integrating out s1 , . . . 
, s4 we can write the above as = A1 (r1 , r2 ) a X Z 1 Z 1 ✓ x + k1 ⇡ 2 ◆ g(!k1 )g(!k2 ) f (x)f (y)sinc (2⇡)2 1 1 k1 ,k2 = a ✓ ◆ ✓ ◆ ✓ ◆ y x y sinc (r1 + k1 )⇡ sinc + k2 ⇡ sinc (r2 + k2 )⇡ dxdy 2 2 2 Z 1Z 1 a X 1 2u 2v = g(!k1 )g(!k2 ) f( !k1 )f ( + !k1 + !r1 ) 2 ⇡ 1 1 k1 ,k2 = a ⇥sinc(u)sinc(u + (k2 k1 )⇡)sinc(v)sinc(v + (k1 k 2 + r1 r2 )⇡)dudv, where the second equality is due to the change of variables u = 2x + k1 ⇡ and v = 2y (r1 + k1 )⇡. Finally, by making a change of variables k = k1 and m = k1 k2 (k1 = k2 + m) we obtain the expression for A1 (r1 , r2 ) given in Lemma E.2. A similar method can be used to obtain the expression for A2 (r1 , r2 ) = A2 (r1 , r2 ) a X Z 1 Z 1 ✓ x + k1 ⇡ 2 ◆ g(!k1 )g(!k2 ) f (x)f (y)sinc (2⇡)2 1 1 k1 ,k2 = a ✓ ◆ ✓ ◆ ✓ ◆ y y x sinc (r1 + k1 )⇡ sinc + k2 ⇡ sinc (r2 + k2 )⇡ dxdy 2 2 2 Z 1Z 1 a X 1 2u 2v = g(!k1 )g(!k2 ) f( !k1 )f ( + !k1 + !r1 ) 2 ⇡ 1 1 k1 ,k2 = a ⇥sinc(u)sinc(u (k2 + r2 + k1 )⇡)sinc(v)sinc(v + (k2 + k1 + r1 )⇡)dudv. By making the change of variables k = k1 and m = k1 + k2 (k1 = expression for A2 (r1 , r2 ). Finally following the same steps as those above we obtain k2 + m) we obtain the stated e a, (g; r1 ), Q e a, (g; r2 )) = A3 (r1 , r2 ) + A4 (r1 , r2 ) + O( ), cov(Q n 80 where a X 1 Z ei(s1 !k1 +s3 !k2 ) e is2 !k1 +r1 is4 !k2 +r2 ⇥c(s1 s3 )c(s2 s4 )ds1 ds2 ds3 ds4 Z a X 1 A4 (r1 , r2 ) = g(!k1 )g(!k2 ) ei(s1 !k1 +s3 !k2 ) e 3 is2 !k1 +r1 is4 !k2 +r2 A3 (r1 , r2 ) = 3 g(!k1 )g(!k2 ) k1 ,k2 = a [ k1 ,k2 = a ⇥c(s1 /2, /2]4 [ s4 )c(s2 /2, /2]4 s3 )ds1 ds2 ds3 ds4 . Again by replacing the covariances in A3 (r1 , r2 ) and A4 (r1 , r4 ) with their spectral representation gives (ii) for d = 1. The result for d > 1 is identical. It is clear that (iii) is true under Assumption 2.4(a,b). To prove (iii) under Assumption 2.5(a,b) we will show that for 1 j 4, supa |Aj (r 1 , r 2 )| < 1. To do this, we first note that by the Cauchy Schwarz inequality we have sup a, 1 ⇡ 2d min( a+m) X 1 d g(!k )g(!k ! m )f ( 2u ! k )f ( 2v + ! k + ! r1 ) k=max( a, a+m) 2 C sup |g(!)| kf k22 , ! where kf k2 is the L2 norm of the spectral density function and C is a finite constant. Thus by taking absolutes of A1 (r 1 , r 2 ) we have |A1 (r 1 , r 2 )| C sup |g(!)| ! 2 kf k22 Z 1 X 2d m= 1 R |Sinc(u m⇡)Sinc(v + (m + r 1 r 2 )⇡)Sinc(u)Sinc(v)|dudv. Finally, by using Lemma C.1(iii), Subba Rao (2015b), we have that supa |A1 (r 1 , r 2 )| < 1. By using the same method we can show that supa |A1 (r 1 , r 2 )|, . . . , supa |A4 (r 1 , r 4 )| < 1. This completes the proof. ⇤ PROOF of Theorem 5.1 Making the classical variance-bias decomposition we have ⇣ ⌘2 ⇣ ⌘2 d d e a, (g; 0)] = var[Ve ] + E[Ve ] e a, (g; 0)] . E Ve var[Q var[Q We first analysis the bias term, in particular E[Ve ]. We note that by using the expectation and variance result in Theorem 3.1 and equation (22), respectively, we have E[Ve ] = d |S| X r2S e a, (g; r)] + var[Q d |S| ✓ X r2S 2 =O( e a, (g; r)] E[Q | {z } 2d Qd j=1 (log 1 X [log + log M ] C1 (! r ) + O ` ,a,n + d |S| r2S ✓ ◆ [log + log M ]d |M | = C1 + O `a, ,n + + . d = 81 ◆ d +log |r j |)2 ) Next we consider var[Ve ], by using the classical cumulant decomposition we have h i2 h i 2d X ✓ e e a, (g; r 1 ), Q e a, (g; r 2 ) + cov Q e a, (g; r 1 ), Q e a, (g; r 2 ) var[V ] = cov Q |S|2 r 1 ,r 2 2S 2d + X ⇣ ⌘ e a, (g; r 1 ), Q e a, (g; r 1 ), Q e a, (g; r 2 ), Q e a, (g; r 2 ) cum Q |S|2 r 1 ,r 2 2S 0 2d X +O @ 2 |S| r 1 ,r 2 2S ⇣ ⌘ ⇣ ⌘ 2 1 e a, (g; r 1 ), Q e a, (g; r 1 ), Q e a, (g; r 2 ) E Q e a, (g; r 2 ) A . 
cum Q e a, (·) based on uniformly sampled locations By substituting the variance/covariance results for Q in (22) and the cumulant bounds in Lemma D.4 into the above we have ! ! h i2 4d 4d 2d X log (a) 1 log (a) e a, (g; r) + O ` ,a,n + var[Ve ] = var Q =O + ` ,a,n + . d d |S|2 |S| r2S ⇤ Thus altogether we have the result. F Sample properties of Qa, (g; r) In this section we summarize the result for Qa, (g; r). To simplify notation we state the results only for the case that the locations are uniformly distributed. We recall that where Vr = 1 n Pn j=1 Z(sj ) exp( e a, (g; r) + G Vr Qa, (g; r) = Q (82) i! r sj ). Theorem F.1 Suppose Assumptions 2.1(i), 2.2, b = b(r) denotes the number of zero elements in the vector r 2 Zd and (i) Assumptions 2.4(i) and 2.5(a,c) hold. Then we have = E [Qa, (g; r)] ( 1 (2⇡)d R !22⇡[ O( d1 b ) 1 C,C]d f (!)g(!)d! + O( + r 2 Zd /{0} n ) r =0 d (ii) Suppose the Assumptions 2.4(ii) and Assumption 2.5(b,c) hold and {m1 , . . . , md b } is the subset of non-zero values in r = (r1 , . . . , rd ), then we have E [Qa, (g; r)] 8 ⇣ ⌘ Q < O d1 b dj=1b (log + log |mj |) = R c(0) Pa log : 1d k= a g(! k ) + O !2Rd f (!)g(!)d! + n (2⇡) 82 +log krk1 + 1 n r 2 Zd /{0} r=0 . PROOF The proof of (i) immediately follows from Theorem 2.1. The proof of (ii) follows from writing Qa, (g; r) as a quadratic form and taking expectations E [Qa, (g; r)] Z a X 1 = c2 g(! k ) d k= a [ /2, /2]d c(s1 s2 ) exp(i! 0k (s1 s2 ) is02 ! r )ds1 ds2 + Wr , Pa where c2 = n(n 1)/n2 and Wr = c(0)I(r=0) k= a g(! k ) (I(r = 0) denotes the indicator variable). n We then follow the same proof used to prove Theorem 3.1. ⇤ Theorem F.2 [Asymptotic expression for variance] Suppose Assumptions 2.1, 2.2, 2.4 2.5(b,c) hold. Then we have ( C1 (! r ) + 2<[C3 (! r )G ] + 2f2 (! r )|G |2 + O(` ,a,n ) r 1 = r 2 (= r) d cov [Qa, (g; r 1 ), Qa, (g; r 2 )] = O(` ,a,n ) r 1 6= r 2 d h i cov Qa, (g; r 1 ), Qa, (g; r 2 ) = ( C2 (! r ) + 2G C3 (! r ) + 2f2 (! r )G2 + O(` O(` ,a,n ) where C1 (! r ) and C2 (! r ) are defined in (79), G = n1 Z 2 C3 (! r ) = g(!)f ( !)f (! + ! r )d! (2⇡)d 2⇡[ a/ ,a/ ]d Pa ,a,n ) r1 = r 1 6= ad = O(n), Z f2 (! r ) = f ( )f (! r r 2 (= r) , r2 k= a g(! k ), and Rd )d .(83) PROOF To prove the result we use (82). Using this expansion we rewrite cov[Qa, (g; r 1 ), Qa, (g; r 2 )] e a, (g; r) and Vr in terms of Q cov [Qa, (g; r 1 ), Qa, (g; r 2 )] = h i h i h i e a, (g; r 1 ), Q e a, (g; r 2 ) + |G |2 cov [Vr , Vr ] + G cov Q e a, (g; r 1 ), Vr + G cov Q e a, (g; r 2 ), Vr . cov Q 1 2 2 1 h i e a, (g; r 1 ), Q e a, (g; r 2 ) . We In Theorem A.1 we have evaluated an asymptotic expression for cov Q h i e a, (g; r 1 ), Vr . It is straightforward now evaluate similar expressions for cov [Vr1 , Vr2 ] and cov Q 2 to show that ( O( 1 )⇣ d ⌘ r 1 6= r 2 cov[Vr1 , Vr2 ] = (84) d 2f2 (! r1 ) + O 1 + n r1 = r2 h i e a, (g; r 1 ), Vr we consider the case d = 1. Using similar To evaluate an expression for cov Q 2 83 arguments to those in the proof of Lemma E.2 we can show that h i e a, (g; r 1 ), Vr cov Q 2 = 2 c3 a X h g(!k )E c(s1 s3 )c(s2 s3 )eis1 !k eis2 (y !k !r ) is3 ( x y !r ) e k= a a X i +O ✓ n2 ◆ ✓ ◆ ✓ ◆ ✓ ◆ Z 2c3 x x (x + y) = g(!k ) f (x)f (y)sinc + k⇡ sinc (k + r1 )⇡ sinc r2 ⇡ dxdy (2⇡)2 2 2 2 R3 k= a ✓ ◆ +O n2 ✓ ◆ ✓ ◆ Z a 2c3 X 2u 2v = g(!k ) f !k f + !k+r sinc (u) sinc (v) sinc (u + v + (r1 r2 )⇡) dudv ⇡2 R3 k= a ✓ ◆ +O , n2 where c3 = n(n 1)(n 2)/n3 . Finally by replacing sum with an integral, using Lemma E.1 and replacing f 2u ! f 2v + ! + !r with f ( !) f (! 
+ !r ) and by using (54) we have h i e a, (g; r1 ), Vr cov Q 2 Z 2⇡a/ Z 2c3 = g(!)f ( !) f (! + !r1 ) d! sinc (u) sinc (v) sinc (u + v + (r1 r2 )⇡) dudv 2⇡⇡ 2 2⇡a/ R3 ✓ ◆ log2 +O + n2 8 ⇣ ⌘ < C3 (!r ) + O 2 + log2 + 1 r1 = r2 1 ⇣ d n 2 ⌘ n = (85) log : O n2 + + n1 r1 6= r2 Altogether, by using Theorem A.1, (85) and (84) we obtain the result. ⇤ In the following lemma we show that we further simplify the expression for the asymptotic variance if we keep r fixed. Corollary F.1 Suppose Assumption 2.4, 2.5(a,c) or 2.5(b,c) holds, and r is fixed. Let C3 (!) and f2 (!) be defined in (83). Then we have ✓ ◆ krk1 C3 (! r ) = C3 + O , and f2 (! r ) = f2 + O ⇣ krk1 ⌘ , where C3 = 2 (2⇡)d Z g(!)f (!)2 d! 2⇡[ a/ ,a/ ]d and f2 = f2 (0) (note, if g(!) = g( !), then C1 = C2 ). PROOF The proof is the same as the proof of Corollary 3.1. 84 ⇤ Theorem F.3 [CLT on real and imaginary parts] Suppose Assumptions 2.1, 2.2, 2.4 and 2.5(b,c) hold. Let C1 , C2 , C3 and f2 be defined as in Corollary F.1. We define the m-dimension complex random vectors Qm = (Qa, (g, r 1 ), . . . , Qa, (g, r m )), where r 1 , . . . , r m are such that r i 6= r j and r i 6= 0. Under these conditions we have ✓ ◆ 2 d/2 E1 P <Qa, (g, 0), <Qm , =Qm ! N 0, I2m+1 , (86) E1 E1 + <E2 where E1 = C1 + 2<[G C3 ] + 2f2 |G |2 and E2 = C2 + 2G C3 + 2f2 G2 with 2 log (a) 1/2 ! 0 as d n log2d (a) ! 0 and ! 1, n ! 1 and a ! 1. PROOF Using the same method used to prove Lemma D.4, and analogous results can be derived for the cumulants of Qa, (g; r). Asymptotic normality follows from this. We omit the details. ⇤ Application to nonparametric covariance estimator without bias correction In this section we apply the results to the nonparametric estimator considered in Example 2.2(ii); the estimator without bias correction and is a non-negative definite sequence. We recall that ! ✓ ◆ a 2v 1 X 2 0 b cn (v) = T e cn (v) where e cn (v) = |Jn (! k )| exp(iv ! k ) . d k= a and T is the d-dimensional triangle kernel. It is clear the asymptotic sampling properties of b cn (v) are determined by e cn (v). Therefore, we first derive the asymptotic sampling properties of e cn (v). 0· iv We observe that e cn (v) = Qa, (e ; 0), thus we use the results in Section 3 to derive the asymptotic sampling properties of e cn (v). By using Theorem 3.1(ii)(a,b) and Assumption 2.4(ii) we have ✓ ◆ Z 1 log 0 E[e cn (v)] = f (!) exp(iv )d! + O (2⇡)d 2⇡[ a/ ,a/ ]d ! ✓ ◆ Z 1 log = f (!) exp(iv 0 )d! + O + , (87) a (2⇡)d Rd where is such that it satisfies Assumption 2.4(ii)(a). Therefore, if ! 1, then by using Theorem F.3 we have d/2 where ⌃ = Pa (e cn (v) Z 1 (2⇡)d 2⇡[ Z 2G (2⇡)d 2⇡[ +d/2 /a D c(v)) ! N (0, ⌃) ! 0 as a ! 1 and (88) f (!)2 1 + exp(i2v 0 !) d! + a/ ,a/ ]d a/ ,a/ ]d f (!)2 exp(iv 0 !)d! + 2G2 f2 , and G = n1 k= a exp(iv 0 ! k ), which is real. We now derive the sampling properties of b cn (v). By using (87), (88) and properties of the triangle kernel we have ✓ ✓ ◆ ◆ log min1id |vi | max1id |vi | E[b cn (v)] = c(v) + O + 1 + a 85 and d/2 +d/2 /a with ! 0, ✓ b cn (v) d n log2d (a) T ✓ ! 0 and 2v ◆ c(v) log2 (a) 1/2 ◆ D !N 0, T ✓ 2v ◆2 ! ⌃ , ! 0 as a ! 1 and ! 1. Application to parameter estimation using an L2 criterion In this section we consider the asymptotic sampling properties of ✓bn = arg min✓2⇥ Ln (✓), where Ln (·) is defined in (10) and ⇥ is a compact set. We will assume that there exists a ✓0 2 ⇥, such that for all ! 2 Rd , f✓0 (!) = f (!) and there does not exist another ✓ 2 ⇥ such that for all ! 2 Rd R f✓0 (!) = f✓ (!) and in addition Rd kr✓ f (!; ✓0 )k21 d! < 1. 
Furthermore, we will assume that ✓bn ! ✓0 . Making the usual Taylor expansion we have d/2 (✓bn ✓0 ) = A 1 d/2 rLn (✓0 ) + op (1), where Z 2 A= [r✓ f (!; ✓0 )][r✓ f (!; ✓0 )]0 d!, (89) (2⇡)d Rd and it is clear the asymptotic sampling properties of ✓bn are determined by r✓ Ln (✓0 ), which we see P from (11) can be written as r✓ Ln (✓0 ) = Qa, ( 2r✓ f✓0 (·); 0) + 1d ak= a r✓ f✓0 (! k )2 . D Thus by using Theorem F.3 we have d/2 r✓ Ln (✓0 ) ! N (0, B), where Z 4 B = f (!)2 [r✓ f✓ (!)] [r✓ f✓ (!)]0 c✓=✓0 d! d (2⇡) Rd Z Z 4<G 2 2 f (!) r✓ f✓ (!)c✓=✓0 d! + 2|G | f (!)2 d! (2⇡)d Rd Rd P and G = n2 ak= a r✓ f✓ (! k )c✓=✓0 . Therefore, by using the above we have d/2 with d n ! 0 and log2 (a) 1/2 (✓bn P ✓0 ) ! N (0, A ! 0 as a ! 1 and ! 1. 86 1 BA 1 )