Available online at www.tjnsa.com

J. Nonlinear Sci. Appl. 9 (2016), 3048–3065

Research Article

Nonparametric robust function estimation for integrated diffusion processes

Yunyan Wang, Mingtian Tang*

School of Science, Jiangxi University of Science and Technology, No. 86, Hongqi Ave., Ganzhou, 341000, Jiangxi, P. R. China.

Communicated by R. Saadati

Abstract

This paper considers local M-estimation of the unknown drift and diffusion functions of integrated diffusion processes. We show that under appropriate conditions the proposed estimators of the drift and diffusion functions are consistent, and we state conditions that ensure the asymptotic normality of these local M-estimators. Simulation studies show that the proposed estimators perform better than the kernel estimators in terms of robustness. ©2016 All rights reserved.

Keywords: Integrated diffusion process, local linear estimator, M-estimation, robust estimation.
2010 MSC: 62G35, 62G20.

*Corresponding author. Email addresses: yywang@zju.edu.cn (Yunyan Wang), mttang_csu@yahoo.com (Mingtian Tang). Received 2015-10-09.

1. Introduction

In this paper we consider the stochastic process $Y_t = \int_0^t X_s\,ds$, where $X$ is a one-dimensional diffusion process given by
\[
dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dB_t, \tag{1.1}
\]
where $\{B_t, t \ge 0\}$ is a standard Brownian motion (Wiener process), and the drift $\mu(\cdot)$ and diffusion $\sigma(\cdot)$ of the process $\{X_t\}$ are functions only of the contemporaneous value of $X_t$. The integrated diffusion process $(Y_t, X_t)$ solves the second-order stochastic differential equation
\[
\begin{cases}
dY_t = X_t\,dt,\\
dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dB_t. \tag{1.2}
\end{cases}
\]

Integrated diffusion processes can model integrated and differentiated diffusion phenomena and at the same time overcome the difficulties associated with the nondifferentiability of Brownian motion, so these models play an important role in finance and economics. As is well known, the diffusion models (1.1) represent a widely accepted class of processes in finance and economics. Estimation and inference for the infinitesimal mean function $\mu(\cdot)$ and the infinitesimal variance function $\sigma(\cdot)$ in (1.1) have been an important area of modern econometrics, especially estimators proposed and studied on the basis of discrete-time observations of the sample path. For example, see [3, 4] for parametric estimation and [18] for nonparametric estimation based on low-frequency data; for high-frequency data, see [2, 6, 14], among others. However, all sample paths of a diffusion process (1.1) driven by a Brownian motion are of unbounded variation and nowhere differentiable, so (1.1) cannot model integrated and differentiated diffusion processes. On the other hand, integrated diffusion processes play an important role in finance, engineering, and physics. For example, $X_t$ may represent the velocity of a particle and $Y_t$ its coordinate; see, e.g., [11, 23]. Ditlevsen et al. [10] used an integrated diffusion process to model and analyze ice-core data. Statistical inference for discretely observed integrated diffusion processes has been considered recently; for example, parametric inference for integrated diffusion processes has been addressed by [11, 15, 16, 17].
As for nonparametric inference, [22] obtained Nadaraya-Watson estimators of the drift and diffusion functions, [24] studied local linear estimators of the two functions, and [25] developed a re-weighted estimator of the diffusion function. All of the above nonparametric estimators are based on local polynomial regression smoothers. However, local polynomial regression smoothers are not robust, so there is a growing literature on robust methods. One popular robust technique is the M-estimator, which, as pointed out by [20], is the easiest to handle as far as asymptotic theory is concerned. M-estimation has therefore been studied by many authors, such as [8, 9, 19] and the references therein. Furthermore, modified M-estimators have been proposed, such as the local M-estimator, which combines the local linear smoothing technique with the M-estimation technique. Local M-estimators inherit the nice properties of both M-estimators and local linear estimators. For example, [13] proposed a local linear M-estimator with variable bandwidth for the regression function, and [21] developed a robust estimator of the regression function using local polynomial regression techniques. In the present paper we study robust nonparametric statistical inference for the drift and diffusion functions of integrated diffusions: we construct local M-estimators of the drift and diffusion coefficients in the integrated diffusion process (1.2), based on the local linear smoothing technique and the M-estimation technique.

The remainder of the paper is organized as follows. Section 2 introduces the local M-estimators of the drift and diffusion coefficients in the integrated diffusion model and develops the asymptotic results of the estimators under mild conditions. Simulation studies are presented in Section 3. Some useful lemmas and all mathematical proofs are given in Section 4.

2. Local M-estimator and Asymptotic Results

2.1. Local M-estimators

As [22, 25] pointed out, there is a difficulty in estimating integrated diffusion processes: the value of $X$ in model (1.2) at time $t_i$ is impossible to obtain, and at the same time the estimation of model (1.2) cannot be based directly on the observations $\{Y_{t_i}, i = 1, 2, \dots\}$. To simplify, we use the notation $t_i = i\Delta$, where $\Delta = t_i - t_{i-1}$. By $Y_t = \int_0^t X_u\,du$, we have
\[
\frac{Y_{i\Delta} - Y_{(i-1)\Delta}}{\Delta} = \frac{1}{\Delta}\left(\int_0^{i\Delta} X_u\,du - \int_0^{(i-1)\Delta} X_u\,du\right) = \frac{1}{\Delta}\int_{(i-1)\Delta}^{i\Delta} X_u\,du.
\]
When $\Delta$ tends to zero, the values of $X_{i\Delta}$, $X_{(i-1)\Delta}$, and $(Y_{i\Delta} - Y_{(i-1)\Delta})/\Delta$ become close to each other, so our estimation procedure will be based on the following equivalent set of data:
\[
\tilde X_{i\Delta} = \frac{Y_{i\Delta} - Y_{(i-1)\Delta}}{\Delta}.
\]
The estimation of the drift and diffusion functions in the integrated diffusion model (1.2) depends on the following equations:
\[
E\left[\left.\frac{\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta}}{\Delta}\,\right|\,\mathcal F_{(i-1)\Delta}\right] = \mu(X_{(i-1)\Delta}) + o(1), \quad \Delta \to 0, \tag{2.1}
\]
\[
E\left[\left.\frac{3}{2}\,\frac{(\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta})^2}{\Delta}\,\right|\,\mathcal F_{(i-1)\Delta}\right] = \sigma^2(X_{(i-1)\Delta}) + o(1), \quad \Delta \to 0, \tag{2.2}
\]
where $\mathcal F_t = \sigma\{X_s, s \le t\}$. Equations (2.1) and (2.2) can be obtained from Lemma 4.1 in Section 4; readers may also refer to [7] or [22] for more details. By (2.1), the local linear estimator of $\mu(x)$ is defined as the solution of the following problem: choose $a_1$ and $b_1$ to minimize the weighted sum
\[
\sum_{i=1}^n \left(\frac{\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta}}{\Delta} - a_1 - b_1(\tilde X_{i\Delta} - x)\right)^2 K\left(\frac{\tilde X_{(i-1)\Delta} - x}{h}\right),
\]
and by (2.2) the local linear estimator of $\sigma^2(x)$ is defined as the solution of the following problem: choose $a_2$ and $b_2$ to minimize the weighted sum
\[
\sum_{i=1}^n \left(\frac{3}{2}\,\frac{(\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta})^2}{\Delta} - a_2 - b_2(\tilde X_{i\Delta} - x)\right)^2 K\left(\frac{\tilde X_{(i-1)\Delta} - x}{h}\right),
\]
where $K(\cdot)$ is the kernel function and $h = h_n$ is the bandwidth.
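To make the construction concrete, here is a minimal Python sketch of the local linear step, assuming observations of $Y$ on a regular grid of step $\Delta$. The function names are ours, and we use the Gaussian kernel that Section 3 later adopts (any kernel satisfying (A4) below would do).

```python
import numpy as np

def gauss_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def local_linear_drift(Y, Delta, x, h):
    """Local linear (least squares) estimate of (a1, b1) = (mu(x), mu'(x))
    built from the integrated observations Y_0, Y_Delta, ..., Y_{N Delta}."""
    Xt = np.diff(Y) / Delta                 # X~_{i D} = (Y_{i D} - Y_{(i-1) D}) / D
    resp = (Xt[2:] - Xt[1:-1]) / Delta      # responses (X~_{(i+1) D} - X~_{i D}) / D
    w = gauss_kernel((Xt[:-2] - x) / h)     # kernel weights at X~_{(i-1) D}
    Z = np.column_stack([np.ones(resp.size), Xt[1:-1] - x])   # local linear design
    WZ = Z * w[:, None]
    a1, b1 = np.linalg.solve(Z.T @ WZ, WZ.T @ resp)           # weighted least squares
    return a1, b1
```

Replacing the response by $\frac32(\tilde X_{(i+1)\Delta}-\tilde X_{i\Delta})^2/\Delta$ gives the corresponding local linear estimator of $\sigma^2(x)$.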
However, the above local linear estimators are not robust. Therefore, let us choose $a_1$ and $b_1$ to minimize
\[
\sum_{i=1}^n \rho_1\left(\frac{\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta}}{\Delta} - a_1 - b_1(\tilde X_{i\Delta} - x)\right) K\left(\frac{\tilde X_{(i-1)\Delta} - x}{h}\right)
\]
and $a_2$ and $b_2$ to minimize
\[
\sum_{i=1}^n \rho_2\left(\frac{3}{2}\,\frac{(\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta})^2}{\Delta} - a_2 - b_2(\tilde X_{i\Delta} - x)\right) K\left(\frac{\tilde X_{(i-1)\Delta} - x}{h}\right),
\]
or, equivalently, to satisfy the equations
\[
\sum_{i=1}^n \psi_1\left(\frac{\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta}}{\Delta} - a_1 - b_1(\tilde X_{i\Delta} - x)\right) K\left(\frac{\tilde X_{(i-1)\Delta} - x}{h}\right)\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x}{h}\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix} \tag{2.3}
\]
and
\[
\sum_{i=1}^n \psi_2\left(\frac{3}{2}\,\frac{(\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta})^2}{\Delta} - a_2 - b_2(\tilde X_{i\Delta} - x)\right) K\left(\frac{\tilde X_{(i-1)\Delta} - x}{h}\right)\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x}{h}\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}, \tag{2.4}
\]
where $\rho_1(\cdot)$ and $\rho_2(\cdot)$ are given functions and $\psi_1(\cdot)$ and $\psi_2(\cdot)$ are the derivatives of $\rho_1(\cdot)$ and $\rho_2(\cdot)$, respectively. The local M-estimators of $\mu(x)$ and $\mu'(x)$ are denoted by $\hat\mu(x) = \hat a_1$ and $\hat\mu'(x) = \hat b_1$, the solutions of (2.3); the local M-estimators of $\sigma^2(x)$ and $(\sigma^2(x))'$ are denoted by $\hat\sigma^2(x) = \hat a_2$ and $(\hat\sigma^2(x))' = \hat b_2$, the solutions of (2.4).

2.2. Assumptions and Asymptotic Results

Let $x_0$ be a given point. The asymptotic results for our local M-estimators of the drift and diffusion functions in integrated diffusion processes require the following conditions.

(A1). ([22])
(i) Let the interval $D = (l, r)$ be the state space of $X$, and let $s(z) = \exp\{-\int_{z_0}^z \frac{2\mu(x)}{\sigma^2(x)}\,dx\}$ be the scale density function ($z_0$ is an arbitrary point inside $D$). For $x \in D$ and $l < x_1 < x < x_2 < r$, let
\[
S(l, x] = \lim_{x_1 \to l}\int_{x_1}^x s(u)\,du = \infty, \qquad S[x, r) = \lim_{x_2 \to r}\int_x^{x_2} s(u)\,du = \infty;
\]
(ii) $\int_l^r m(x)\,dx < \infty$, where $m(x) = (\sigma^2(x)s(x))^{-1}$ is the speed density function;
(iii) $X_0 = x$ has distribution $P^0$, where $P^0$ is the invariant distribution of the ergodic process $(X_t)_{t\in[0,\infty)}$.

(A2). With $D = (l, r)$ the state space of $X$, we assume that
\[
\limsup_{x \to r}\left(\frac{\mu(x)}{\sigma(x)} - \frac{\sigma'(x)}{2}\right) < 0, \qquad \liminf_{x \to l}\left(\frac{\mu(x)}{\sigma(x)} - \frac{\sigma'(x)}{2}\right) > 0.
\]

Remark 2.1. (A1) ensures that $X$ is stationary ([1]); the stationary density of $X$ is denoted by $p(x)$ throughout this paper. (A2) ensures that the process $X$ is $\alpha$-mixing. Under assumptions (A1) and (A2), $\{\tilde X_{i\Delta}, i = 0, 1, \dots\}$ is stationary and $\alpha$-mixing.

Let $\alpha(k)$ be the mixing coefficient of the process $X$; we assume that $\alpha(k)$ satisfies the following condition.

(A3). The mixing coefficient $\alpha(k)$ of the process $X$ satisfies $\sum_{k\ge 1} k^a (\alpha(k))^{\gamma/(2+\gamma)} < \infty$ for some $a > \gamma/(2+\gamma)$, where $\gamma$ is given in assumption (A10).

The kernel function $K(\cdot)$ and the bandwidth $h$ satisfy the following assumptions (A4) and (A5).

(A4). The kernel $K(\cdot)$ is a continuously differentiable, symmetric density function compactly supported on $[-1, 1]$.

(A5). $\Delta \to 0$, $h \to 0$, and $nh^2 \to \infty$ as $n \to \infty$.

Remark 2.2. For simplicity, we consider at this stage only positive and symmetric kernels, which are the most classical ones. For bandwidth selection, the book [12] is recommended.

(A6).
(i) $\mu(x)$ and $\sigma(x)$ have continuous derivatives of order 4 and satisfy $|\mu(x)| \le C(1+|x|)^{\lambda}$ and $|\sigma(x)| \le C(1+|x|)^{\lambda}$ for some $\lambda > 0$;
(ii) $E[X_0^r] < \infty$, where $r = \max\{4\lambda,\ 1+3\lambda,\ -1+5\lambda,\ -2+6\lambda\}$.

Remark 2.3. Assumption (A6) guarantees that Lemma 4.1 can be applied properly throughout the paper.

(A7). The density function $p(x)$ of the process $X$ is continuous at $x_0$, and $p(x_0) > 0$. Furthermore, the joint densities of $X_{i\Delta}$ and $X_{j\Delta}$ are bounded for all $i, j$.

(A8).
(i) $\lim_{h\to 0} \frac{1}{h} E(|K'(\xi_{ni})|^2) < \infty$;
(ii) $\lim_{h\to 0} \frac{1}{h} E(|K'(\xi_{ni})|^4) < \infty$, where
\[
\xi_{ni} = \theta\,\frac{X_{(i-1)\Delta} - x}{h} + (1-\theta)\,\frac{\tilde X_{(i-1)\Delta} - x}{h}, \qquad 0 \le \theta \le 1.
\]
(A9).
(i) $E[\psi_1(u_{i\Delta}) \mid X_{(i-1)\Delta} = x] = o(1)$, with $u_{i\Delta} = \frac{\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta}}{\Delta} - \mu(X_{(i-1)\Delta})$;
(ii) $E[\psi_2(v_{i\Delta}) \mid X_{(i-1)\Delta} = x] = o(1)$, with $v_{i\Delta} = \frac{3}{2}\,\frac{(\tilde X_{(i+1)\Delta} - \tilde X_{i\Delta})^2}{\Delta} - \sigma^2(X_{(i-1)\Delta})$.

(A10).
(i) The function $\psi_1(\cdot)$ is continuous and has a derivative $\psi_1'(\cdot)$ almost everywhere. Furthermore, the functions
\[
E[\psi_1'(u_{i\Delta}) \mid X_{(i-1)\Delta} = x] > 0, \quad E[\psi_1^2(u_{i\Delta}) \mid X_{(i-1)\Delta} = x] > 0, \quad E[(\psi_1'(u_{i\Delta}))^2 \mid X_{(i-1)\Delta} = x] > 0
\]
are continuous at the point $x_0$, and there exists a constant $\gamma > 0$ such that
\[
E[|\psi_1(u_{i\Delta})|^{2+\gamma} \mid X_{(i-1)\Delta} = x], \qquad E[|\psi_1'(u_{i\Delta})|^{2+\gamma} \mid X_{(i-1)\Delta} = x]
\]
are bounded in a neighborhood of $x_0$;
(ii) The function $\psi_2(\cdot)$ is continuous and has a derivative $\psi_2'(\cdot)$ almost everywhere. Furthermore, the functions
\[
E[\psi_2'(v_{i\Delta}) \mid X_{(i-1)\Delta} = x] > 0, \quad E[\psi_2^2(v_{i\Delta}) \mid X_{(i-1)\Delta} = x] > 0, \quad E[(\psi_2'(v_{i\Delta}))^2 \mid X_{(i-1)\Delta} = x] > 0
\]
are continuous at the point $x_0$, and there exists a constant $\gamma > 0$ such that
\[
E[|\psi_2(v_{i\Delta})|^{2+\gamma} \mid X_{(i-1)\Delta} = x], \qquad E[|\psi_2'(v_{i\Delta})|^{2+\gamma} \mid X_{(i-1)\Delta} = x]
\]
are bounded in a neighborhood of $x_0$.

(A11).
(i) For any $i, j$,
\[
E[\psi_1^2(u_{i\Delta}) + \psi_1^2(u_{j\Delta}) \mid X_{(i-1)\Delta} = x,\ X_{(j-1)\Delta} = y], \qquad
E[(\psi_1'(u_{i\Delta}))^2 + (\psi_1'(u_{j\Delta}))^2 \mid X_{(i-1)\Delta} = x,\ X_{(j-1)\Delta} = y]
\]
are bounded in a neighborhood of $x_0$;
(ii) For any $i, j$,
\[
E[\psi_2^2(v_{i\Delta}) + \psi_2^2(v_{j\Delta}) \mid X_{(i-1)\Delta} = x,\ X_{(j-1)\Delta} = y], \qquad
E[(\psi_2'(v_{i\Delta}))^2 + (\psi_2'(v_{j\Delta}))^2 \mid X_{(i-1)\Delta} = x,\ X_{(j-1)\Delta} = y]
\]
are bounded in a neighborhood of $x_0$.

(A12).
(i) The function $\psi_1'(\cdot)$ satisfies
\[
E\Big[\sup_{|z|\le\delta} |\psi_1'(u_{i\Delta}+z) - \psi_1'(u_{i\Delta})| \,\Big|\, X_{(i-1)\Delta} = x\Big] = o(1),
\]
\[
E\Big[\sup_{|z|\le\delta} |\psi_1(u_{i\Delta}+z) - \psi_1(u_{i\Delta}) - \psi_1'(u_{i\Delta})z| \,\Big|\, X_{(i-1)\Delta} = x\Big] = o(\delta),
\]
as $\delta \to 0$, uniformly in $x$ in a neighborhood of $x_0$;
(ii) The function $\psi_2'(\cdot)$ satisfies
\[
E\Big[\sup_{|z|\le\delta} |\psi_2'(v_{i\Delta}+z) - \psi_2'(v_{i\Delta})| \,\Big|\, X_{(i-1)\Delta} = x\Big] = o(1),
\]
\[
E\Big[\sup_{|z|\le\delta} |\psi_2(v_{i\Delta}+z) - \psi_2(v_{i\Delta}) - \psi_2'(v_{i\Delta})z| \,\Big|\, X_{(i-1)\Delta} = x\Big] = o(\delta),
\]
as $\delta \to 0$, uniformly in $x$ in a neighborhood of $x_0$.

Remark 2.4. The conditions imposed on $\psi_1(\cdot)$ and $\psi_2(\cdot)$ in assumptions (A9)-(A12) are mild and are satisfied in many applications. In particular, they are fulfilled by Huber's $\psi(\cdot)$ function. In this paper we need neither the monotonicity nor the boundedness of $\psi_i(\cdot)$, $i = 1, 2$, nor the convexity of the functions $\rho_i(\cdot)$, $i = 1, 2$. For more details on these conditions we refer to [5] or [13].

(A13). There exists a sequence of positive integers $q_n$ such that $q_n \to \infty$, $q_n = o((nh)^{1/2})$, and $(n/h)^{1/2}\alpha(q_n) \to 0$ as $n \to \infty$.

(A14). There exists $\tau > 2+\gamma$, where $\gamma$ is given in assumption (A10), such that $E\{|\psi_1(u_{i\Delta})|^{\tau} \mid X_{(i-1)\Delta} = x\}$ and $E\{|\psi_2(v_{i\Delta})|^{\tau} \mid X_{(i-1)\Delta} = x\}$ are bounded for all $x$ in a neighborhood of $x_0$, and $\alpha(n) = O(n^{-\theta})$, where $\theta \ge (2+\gamma)\tau/\{2(\tau-2-\gamma)\}$.

(A15). $n^{-\gamma/4} h^{(2+\gamma)/\tau - 1 - \gamma/4} = O(1)$, where $\gamma$ and $\tau$ are given in assumptions (A10) and (A14), respectively.

Remark 2.5. Assumption (A15) is automatically satisfied for $\gamma \ge 1$; it is also fulfilled for $0 < \gamma < 1$ if $\tau$ satisfies $\gamma < \tau - 2 \le \gamma/(1-\gamma)$.

Throughout the paper, let
\[
K_l = \int K(u)u^l\,du, \qquad J_l = \int u^l K^2(u)\,du, \qquad l \ge 0,
\]
\[
U = \begin{pmatrix} K_0 & K_1 \\ K_1 & K_2 \end{pmatrix}, \qquad
V = \begin{pmatrix} J_0 & J_1 \\ J_1 & J_2 \end{pmatrix}, \qquad
A = \begin{pmatrix} K_2 \\ K_3 \end{pmatrix},
\]
\[
G_1(x) = E[\psi_1'(u_{i\Delta}) \mid X_{(i-1)\Delta} = x], \qquad G_2(x) = E[\psi_1^2(u_{i\Delta}) \mid X_{(i-1)\Delta} = x],
\]
\[
H_1(x) = E[\psi_2'(v_{i\Delta}) \mid X_{(i-1)\Delta} = x], \qquad H_2(x) = E[\psi_2^2(v_{i\Delta}) \mid X_{(i-1)\Delta} = x].
\]

Our main results are as follows.

Theorem 2.6. Under assumptions (A1)-(A7) and conditions (i) of assumptions (A8)-(A12), there exist solutions $\hat\mu(x_0)$ and $\hat\mu'(x_0)$ to equation (2.3) such that

(i)
\[
\begin{pmatrix} \hat\mu(x_0) - \mu(x_0) \\ h(\hat\mu'(x_0) - \mu'(x_0)) \end{pmatrix} \xrightarrow{P} 0, \quad \text{as } n \to \infty.
\]
(ii) Furthermore, if assumptions (A13)-(A15) hold, then
\[
\sqrt{nh}\left[\begin{pmatrix} \hat\mu(x_0) - \mu(x_0) \\ h(\hat\mu'(x_0) - \mu'(x_0)) \end{pmatrix} - \frac{h^2\mu''(x_0)}{2}\,U^{-1}A\right] \xrightarrow{D} N(0, \Sigma_1),
\]
where "$\xrightarrow{P}$" means convergence in probability, "$\xrightarrow{D}$" means convergence in distribution, and
\[
\Sigma_1 = \frac{G_2(x_0)}{G_1^2(x_0)p(x_0)}\,U^{-1}VU^{-1}.
\]
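As a concrete illustration of the constants entering $\Sigma_1$ (our addition; the theorem itself places no such restriction on $\psi_1$), take $\psi_1$ to be Huber's function with truncation constant $c$, the choice used in Section 3:
\[
\psi_c(z) = \max\{-c,\ \min(c, z)\}, \qquad \psi_c'(z) = \mathbf 1\{|z| < c\}\ \text{a.e.},
\]
so that
\[
G_1(x_0) = P\big(|u_{i\Delta}| < c \mid X_{(i-1)\Delta} = x_0\big), \qquad
G_2(x_0) = E\big[\min(|u_{i\Delta}|, c)^2 \mid X_{(i-1)\Delta} = x_0\big].
\]
As $c \to \infty$, $\psi_c(z) \to z$, so $G_1(x_0) \to 1$ and $G_2(x_0) \to E[u_{i\Delta}^2 \mid X_{(i-1)\Delta} = x_0]$, and $\Sigma_1$ reduces to the asymptotic variance of the non-robust local linear estimator; smaller $c$ trades efficiency at the model for protection against outliers.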
Theorem 2.7. Under assumptions (A1)-(A7) and conditions (ii) of assumptions (A8)-(A12), there exist solutions $\hat\sigma^2(x_0)$ and $(\hat\sigma^2(x_0))'$ to equation (2.4) such that

(i)
\[
\begin{pmatrix} \hat\sigma^2(x_0) - \sigma^2(x_0) \\ h[(\hat\sigma^2(x_0))' - (\sigma^2(x_0))'] \end{pmatrix} \xrightarrow{P} 0, \quad \text{as } n \to \infty.
\]
(ii) Furthermore, if assumptions (A13)-(A15) hold, then
\[
\sqrt{nh}\left[\begin{pmatrix} \hat\sigma^2(x_0) - \sigma^2(x_0) \\ h[(\hat\sigma^2(x_0))' - (\sigma^2(x_0))'] \end{pmatrix} - \frac{h^2(\sigma^2(x_0))''}{2}\,U^{-1}A\right] \xrightarrow{D} N(0, \Sigma_2),
\]
where
\[
\Sigma_2 = \frac{H_2(x_0)}{H_1^2(x_0)p(x_0)}\,U^{-1}VU^{-1}.
\]

3. Simulations

In this section we carry out a Monte Carlo experiment to assess the performance of the local M-estimators discussed in this paper, comparing the mean square error (MSE) of the new estimators with that of the following kernel-type estimators of $\mu(x)$ and $\sigma^2(x)$ (see [22]):
\[
\bar\mu(x) = \frac{\sum_{i=1}^n K\big(\frac{\tilde X_{(i-1)\Delta}-x}{h}\big)\,\frac{\tilde X_{(i+1)\Delta}-\tilde X_{i\Delta}}{\Delta}}{\sum_{i=1}^n K\big(\frac{\tilde X_{(i-1)\Delta}-x}{h}\big)},
\qquad
\bar\sigma^2(x) = \frac{\sum_{i=1}^n K\big(\frac{\tilde X_{(i-1)\Delta}-x}{h}\big)\,\frac{3}{2}\frac{(\tilde X_{(i+1)\Delta}-\tilde X_{i\Delta})^2}{\Delta}}{\sum_{i=1}^n K\big(\frac{\tilde X_{(i-1)\Delta}-x}{h}\big)}.
\]
Our experiment is based on the simulation of $Y_t = Y_0 + \int_0^t X_u\,du$, where $X$ is an ergodic process governed by the stochastic differential equation
\[
dX_t = -10X_t\,dt + \sqrt{0.1 + 0.1X_t^2}\,dB_t, \tag{3.1}
\]
and we use the Euler-Maruyama method to approximate the numerical solution of the stochastic differential equation. Throughout we take Huber's function $\psi_1(z) = \max\{-c, \min(c, z)\}$ with $c = 0.135$, we smooth with a Gaussian kernel, and the bandwidth used in our simulation is $h = h_{\mathrm{opt}}$, the optimal bandwidth minimizing the mean square error (MSE)
\[
\frac{1}{n}\sum_{i=1}^n \big(\hat\mu(x_i) - \mu(x_i)\big)^2,
\]
where $\{x_i, i = 1, 2, \dots, n\}$ are chosen uniformly to cover the range of the sample path of $X_t$. Since $\hat\mu(\cdot)$ has no explicit expression, we obtain it by iteration: for any initial value $(\hat\mu_0(x), \hat\mu_0'(x))$,
\[
\begin{pmatrix} \hat\mu_t(x) \\ \hat\mu_t'(x) \end{pmatrix}
= \begin{pmatrix} \hat\mu_{t-1}(x) \\ \hat\mu_{t-1}'(x) \end{pmatrix}
- \big[W_n(\hat\mu_{t-1}(x), \hat\mu_{t-1}'(x))\big]^{-1}\,\Psi_n(\hat\mu_{t-1}(x), \hat\mu_{t-1}'(x)),
\]
where $\hat\mu_{t-1}(x)$ and $\hat\mu_{t-1}'(x)$ are the $(t-1)$th iterates of $\hat\mu(x)$ and $\hat\mu'(x)$, and
\[
W_n(a_1, b_1) = \left(\frac{\partial\Psi_n(a_1, b_1)}{\partial a_1},\ \frac{\partial\Psi_n(a_1, b_1)}{\partial b_1}\right),
\]
\[
\Psi_n(a_1, b_1) = \sum_{i=1}^n \psi_1\left(\frac{\tilde X_{(i+1)\Delta}-\tilde X_{i\Delta}}{\Delta} - a_1 - b_1(\tilde X_{i\Delta}-x)\right) K\left(\frac{\tilde X_{(i-1)\Delta}-x}{h}\right)\begin{pmatrix} 1 \\[2pt] \frac{\tilde X_{i\Delta}-x}{h}\end{pmatrix}.
\]
The procedure terminates when
\[
\left\|\begin{pmatrix} \hat\mu_t(x) \\ \hat\mu_t'(x) \end{pmatrix} - \begin{pmatrix} \hat\mu_{t-1}(x) \\ \hat\mu_{t-1}'(x) \end{pmatrix}\right\| \le 1\times 10^{-4}.
\]
To illustrate the continuous-time integrated process, Figure 1 presents a simulated path of $X$ defined by (3.1), and Figure 2 presents a simulated path of $Y_t = Y_0 + \int_0^t X_u\,du$, where $t \in [0, T] = [0, 10]$ and $X$ is governed by (3.1). To compare the robustness of the local M-estimators with that of the Nadaraya-Watson estimators, Tables 1 and 2 report the performance of the local M-estimator and the Nadaraya-Watson estimator for the drift function $\mu(\cdot)$ and the diffusion function $\sigma^2(\cdot)$, respectively, in terms of the MSE. The local M-estimator performs better than the Nadaraya-Watson estimator, and the performance of both estimators improves as the sample size increases.
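For completeness, the following Python sketch assembles the experiment under stated assumptions: Euler-Maruyama for (3.1), the integrated path $Y$, and the Newton-type iteration above with Huber's $\psi$. All names are ours; the seed, sample sizes, initial values, bandwidth, and truncation constant in the usage line are illustrative only (in particular, $c$ must be calibrated to the scale of the residuals, so we do not reproduce the paper's $c = 0.135$ here), and numerical safeguards such as step halving are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Euler-Maruyama for (3.1) and the integrated path Y (illustrative sizes) ---
T, N = 10.0, 2000
dt = T / N
X = np.empty(N + 1)
Y = np.empty(N + 1)
X[0], Y[0] = 0.0, 100.0
for i in range(N):
    X[i + 1] = (X[i] - 10.0 * X[i] * dt
                + np.sqrt(0.1 + 0.1 * X[i] ** 2) * rng.normal(scale=np.sqrt(dt)))
    Y[i + 1] = Y[i] + X[i] * dt            # dY_t = X_t dt

def gauss_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def huber_psi(z, c):
    return np.clip(z, -c, c)               # psi_c(z) = max{-c, min(c, z)}

def huber_dpsi(z, c):
    return (np.abs(z) < c).astype(float)   # psi_c'(z) = 1{|z| < c} a.e.

def local_M_drift(Y, Delta, x, h, c, tol=1e-4, max_iter=50):
    """Local M-estimate of (mu(x), mu'(x)) via the Newton iteration of Section 3.
    The design (1, X~ - x) is an equivalent reparameterization of (1, (X~ - x)/h);
    both give the same root (a1, b1)."""
    Xt = np.diff(Y) / Delta                          # pre-averaged data X~
    resp = (Xt[2:] - Xt[1:-1]) / Delta               # (X~_{(i+1)D} - X~_{i D}) / D
    w = gauss_kernel((Xt[:-2] - x) / h)              # weights at X~_{(i-1) D}
    Z = np.column_stack([np.ones(resp.size), Xt[1:-1] - x])
    # start from the non-robust local linear fit, then iterate
    theta = np.linalg.solve(Z.T @ (Z * w[:, None]), Z.T @ (w * resp))
    for _ in range(max_iter):
        res = resp - Z @ theta
        Psi = Z.T @ (w * huber_psi(res, c))          # estimating equation (2.3)
        W = -Z.T @ (Z * (w * huber_dpsi(res, c))[:, None])  # Jacobian of Psi
        step = np.linalg.solve(W, Psi)               # assumes W is nonsingular,
        theta = theta - step                         # i.e. enough residuals inside [-c, c]
        if np.max(np.abs(step)) <= tol:
            break
    return theta                                     # (mu_hat(x), mu_hat'(x))

mu_hat, _ = local_M_drift(Y, dt, x=0.0, h=0.1, c=1.0)
```

Averaging $(\hat\mu(x_i)-\mu(x_i))^2$ over a uniform grid $\{x_i\}$ then reproduces the MSE criterion used to select $h_{\mathrm{opt}}$.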
Figure 1: The sample path of the process $X_t$. Figure 2: The sample path of the integrated process $Y_t$. [Time-series plots over $t \in [0, 10]$; figures not reproduced here.]

Table 1: The mean square error of the Nadaraya-Watson estimator (MSE11) and the local M-estimator (MSE12) for the drift function $\mu(\cdot)$.

Sample size n    MSE11     MSE12
n = 100          0.9084    0.1652
n = 200          0.7499    0.1214
n = 400          0.5754    0.1272
n = 800          0.3742    0.1171

Table 2: The mean square error of the Nadaraya-Watson estimator (MSE21) and the local M-estimator (MSE22) for the diffusion function $\sigma^2(\cdot)$.

Sample size n    MSE21     MSE22
n = 100          0.0988    0.0459
n = 200          0.0891    0.0458
n = 400          0.0881    0.0390
n = 800          0.0862    0.0392

4. Lemmas and Proofs

In order to prove Theorem 2.6 and Theorem 2.7, we need the following lemmas.

Lemma 4.1 ([22]). Let $Z$ be a $d$-dimensional diffusion process governed by the stochastic integral equation
\[
Z_t = Z_0 + \int_0^t \mu(Z_s)\,ds + \int_0^t \sigma(Z_s)\,dB_s,
\]
where $\mu(z) = [\mu_i(z)]_{d\times 1}$ is a $d\times 1$ vector, $\sigma(z) = [\sigma_{ij}(z)]_{d\times d}$ is a $d\times d$ diagonal matrix, and $B_t$ is a $d\times 1$ vector of independent Brownian motions. Assume that $\mu$ and $\sigma$ have continuous partial derivatives of order $2s$. Let $f(z)$ be a continuous function defined on $\mathbb R^d$ with values in $\mathbb R^d$ and with continuous partial derivatives of order $2s+2$. Then
\[
E[f(Z_{i\Delta}) \mid Z_{(i-1)\Delta}] = \sum_{k=0}^s \frac{\Delta^k}{k!} L^k f(Z_{(i-1)\Delta}) + R,
\]
where $L$ is the second-order differential operator
\[
L = \sum_{i=1}^d \mu_i(z)\frac{\partial}{\partial z_i} + \frac12\left(\sigma_{11}^2(z)\frac{\partial^2}{\partial z_1^2} + \sigma_{22}^2(z)\frac{\partial^2}{\partial z_2^2} + \cdots + \sigma_{dd}^2(z)\frac{\partial^2}{\partial z_d^2}\right),
\]
and
\[
R = \int_{(i-1)\Delta}^{i\Delta}\int_{(i-1)\Delta}^{u_1}\int_{(i-1)\Delta}^{u_2}\cdots\int_{(i-1)\Delta}^{u_s} E\big[L^{s+1}f(Z_{u_{s+1}}) \mid Z_{(i-1)\Delta}\big]\,du_{s+1}\,du_s\cdots du_1
\]
is a stochastic function of order $\Delta^{s+1}$.
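To see how Lemma 4.1 is used, here is a short worked instance (our illustration, not part of the statement in [22]). Take $d = 1$, $s = 1$, and $Z = X$ from (1.1), so that
\[
Lf(x) = \mu(x)f'(x) + \tfrac12\sigma^2(x)f''(x).
\]
With $f(x) = x$ and $f(x) = x^2$, respectively, the expansion gives
\[
E[X_{i\Delta} \mid X_{(i-1)\Delta}] = X_{(i-1)\Delta} + \Delta\,\mu(X_{(i-1)\Delta}) + R,
\]
\[
E[X_{i\Delta}^2 \mid X_{(i-1)\Delta}] = X_{(i-1)\Delta}^2 + \Delta\big(2X_{(i-1)\Delta}\mu(X_{(i-1)\Delta}) + \sigma^2(X_{(i-1)\Delta})\big) + R',
\]
with $R$, $R'$ of order $\Delta^2$. Applied to the bivariate process $(Y_t, X_t)$ of (1.2), the same device yields the conditional moments behind (2.1) and (2.2) and those computed in the proof of Lemma 4.3 below.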
Lemma 4.2 ([22]). Let
\[
\xi_{ni} = \theta\,\frac{X_{(i-1)\Delta}-x}{h} + (1-\theta)\,\frac{\tilde X_{(i-1)\Delta}-x}{h}, \qquad 0 \le \theta \le 1,
\]
and
\[
\varepsilon_{1,n} = \frac{1}{nh}\sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x}{h}\Big) g(\tilde X_{(i-1)\Delta}, \tilde X_{i\Delta}), \qquad
\varepsilon_{2,n} = \frac{1}{nh}\sum_{i=1}^n K\Big(\frac{X_{(i-1)\Delta}-x}{h}\Big) g(\tilde X_{(i-1)\Delta}, \tilde X_{i\Delta}),
\]
where $g$ is a measurable function on $\mathbb R\times\mathbb R$. Assume that (A1), (A4), $\sqrt\Delta/h \to 0$, and one of the following two conditions hold:
(i) $E[(g(\tilde X_{(i-1)\Delta}, \tilde X_{i\Delta}))^2] < \infty$ and (A8)(ii);
(ii) $h^{-1}E[|(\tilde X_{(i-1)\Delta} - X_{(i-1)\Delta})\,g(\tilde X_{(i-1)\Delta}, \tilde X_{i\Delta})|^2] < \infty$ and (A8)(i).
Then $|\varepsilon_{1,n} - \varepsilon_{2,n}| \xrightarrow{P} 0$.

Lemma 4.3. Under assumptions (A1)-(A7) and conditions (i) of assumptions (A8)-(A12), for any random sequence $\{\eta_j\}_{j=1}^n$ with $\max_{1\le j\le n}|\eta_j| = o_p(1)$, we have

(i)
\[
\sum_{i=1}^n \psi_1'(u_{i\Delta}+\eta_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l = nh^{l+1}G_1(x_0)p(x_0)K_l\,(1+o_p(1));
\]
(ii)
\[
\sum_{i=1}^n \psi_1'(u_{i\Delta}+\eta_{i\Delta})R_1(\tilde X_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l = nh^{l+3}\,\frac{G_1(x_0)}{2}\,\mu''(x_0)p(x_0)K_{l+2}\,(1+o_p(1)),
\]
where $R_1(\tilde X_{i\Delta}) = \mu(\tilde X_{i\Delta}) - \mu(x_0) - \mu'(x_0)(\tilde X_{i\Delta}-x_0)$.

Proof. (i) Obviously, we have
\[
\sum_{i=1}^n \psi_1'(u_{i\Delta}+\eta_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l
= \sum_{i=1}^n \psi_1'(u_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l
+ \sum_{i=1}^n \big[\psi_1'(u_{i\Delta}+\eta_{i\Delta})-\psi_1'(u_{i\Delta})\big]K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l := T_{n1} + T_{n2}.
\]
For $T_{n1}$, by Lemma 4.2 we need to consider
\[
E\Big[\sum_{i=1}^n \psi_1'(u_{i\Delta})K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l\Big]
= E\Big[\sum_{i=1}^n K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l\, E[\psi_1'(u_{i\Delta}) \mid X_{(i-1)\Delta} = x_0]\Big]
= G_1(x_0)\,E\Big[\sum_{i=1}^n K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l\Big].
\]
Next, we will show that
\[
E\Big[\sum_{i=1}^n K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l\Big] = nh^{l+1}p(x_0)K_l\,(1+o(1)).
\]
Along the same lines of argument as in Lemma 1 of [5], we have
\[
E\Big[\sum_{i=1}^n K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)(X_{(i-1)\Delta}-x_0)^l\Big] = nh^{l+1}p(x_0)K_l\,(1+o(1)),
\]
so it suffices to show that
\[
\frac{1}{nh^{l+1}}\sum_{i=1}^n K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l - \frac{1}{nh^{l+1}}\sum_{i=1}^n K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)(X_{(i-1)\Delta}-x_0)^l \xrightarrow{P} 0. \tag{4.1}
\]
For (4.1), let
\[
\delta_{1n} = \frac{1}{nh^{l+1}}\sum_{i=1}^n K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)\big[(\tilde X_{i\Delta}-x_0)^l - (X_{(i-1)\Delta}-x_0)^l\big];
\]
it suffices to show that $\lim_{n\to\infty}E(\delta_{1n}) = 0$ and $\lim_{n\to\infty}\mathrm{Var}(\delta_{1n}) = 0$.

By stationarity and Lemma 4.1 (which gives $E[(\tilde X_{i\Delta}-X_{(i-1)\Delta}) \mid X_{(i-1)\Delta}] = \frac{\Delta}{2}\mu(X_{(i-1)\Delta}) + O(\Delta^2)$), we have
\[
E(\delta_{1n}) = E\Big[\frac{1}{h^{l+1}}K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)\big((\tilde X_{i\Delta}-x_0)^l - (X_{(i-1)\Delta}-x_0)^l\big)\Big]
\]
\[
= E\Big[\frac{1}{h^{l+1}}K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)\,l(X_{(i-1)\Delta}-x_0)^{l-1}(\tilde X_{i\Delta}-X_{(i-1)\Delta})\Big] + o(1)
\]
\[
= E\Big[\frac{1}{h^{l+1}}K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)\,l(X_{(i-1)\Delta}-x_0)^{l-1}\,E[(\tilde X_{i\Delta}-X_{(i-1)\Delta}) \mid X_{(i-1)\Delta}]\Big] + o(1)
\]
\[
= \frac{\Delta}{2h}\,E\Big[\frac{1}{h}K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)\,l\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)^{l-1}\mu(X_{(i-1)\Delta})\Big] + O(\Delta^2),
\]
which implies that $\lim_{n\to\infty}E(\delta_{1n}) = 0$.

On the other hand,
\[
\mathrm{Var}(\delta_{1n}) = \mathrm{Var}\Big[\frac{1}{nh^{l+1}}\sum_{i=1}^n K\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)\big((\tilde X_{i\Delta}-x_0)^l - (X_{(i-1)\Delta}-x_0)^l\big)\Big]
= \frac{1}{nh^2}\,\mathrm{Var}\Big[\frac{1}{\sqrt n}\sum_{i=1}^n f_{il}\Big] + o(1),
\]
where $f_{il} = \frac1h K\big(\frac{X_{(i-1)\Delta}-x_0}{h}\big)\big(\frac{X_{(i-1)\Delta}-x_0}{h}\big)^{l-1} l\,(\tilde X_{i\Delta}-X_{(i-1)\Delta})$, and
\[
\mathrm{Var}\Big(\frac{1}{\sqrt n}\sum_{i=1}^n f_{il}\Big) = \frac1n\sum_{i=1}^n \mathrm{Var}(f_{il}) + \frac2n\sum_{j=1}^{n-1}\sum_{i=j+1}^{n}\mathrm{Cov}(f_{il}, f_{jl}).
\]
Next we show that $\mathrm{Var}\big(\frac{1}{\sqrt n}\sum_{i=1}^n f_{il}\big) < \infty$; by the same arguments as [23], we only need to show $E(f_{il}^2) < \infty$. In fact,
\[
E(f_{il}^2) = E\Big[\frac{1}{h^2}K^2\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)\Big(\frac{X_{(i-1)\Delta}-x_0}{h}\Big)^{2l-2} l^2\, E[(\tilde X_{i\Delta}-X_{(i-1)\Delta})^2 \mid X_{(i-1)\Delta}]\Big],
\]
and
\[
E[(\tilde X_{i\Delta}-X_{(i-1)\Delta})^2 \mid X_{(i-1)\Delta}] = E[\tilde X_{i\Delta}^2 \mid X_{(i-1)\Delta}] - 2E[\tilde X_{i\Delta}X_{(i-1)\Delta} \mid X_{(i-1)\Delta}] + E[X_{(i-1)\Delta}^2 \mid X_{(i-1)\Delta}].
\]
By Lemma 4.1, for the first term we have
\[
E[\tilde X_{i\Delta}^2 \mid X_{(i-1)\Delta}] = E\Big[\frac{(Y_{i\Delta}-Y_{(i-1)\Delta})^2}{\Delta^2}\,\Big|\, X_{(i-1)\Delta}\Big]
= X_{(i-1)\Delta}^2 + \frac13\sigma^2(X_{(i-1)\Delta})\Delta + X_{(i-1)\Delta}\mu(X_{(i-1)\Delta})\Delta + R_1,
\]
where $R_1$ is a stochastic function of order $\Delta^2$. For the second term,
\[
E[\tilde X_{i\Delta}X_{(i-1)\Delta} \mid X_{(i-1)\Delta}] = E\Big[\frac{Y_{i\Delta}-Y_{(i-1)\Delta}}{\Delta}\,X_{(i-1)\Delta}\,\Big|\,X_{(i-1)\Delta}\Big]
= X_{(i-1)\Delta}^2 + \frac12 X_{(i-1)\Delta}\mu(X_{(i-1)\Delta})\Delta + R_2,
\]
where $R_2$ is a stochastic function of order $\Delta^2$. For the third term, $E[X_{(i-1)\Delta}^2 \mid X_{(i-1)\Delta}] = X_{(i-1)\Delta}^2$. Then
\[
E[(\tilde X_{i\Delta}-X_{(i-1)\Delta})^2 \mid X_{(i-1)\Delta}] = \frac13\sigma^2(X_{(i-1)\Delta})\Delta + R, \qquad R = R_1 - 2R_2.
\]
So $E(f_{il}^2) < \infty$, and $\lim_{n\to\infty}\mathrm{Var}(\delta_{1n}) = 0$ is immediate from $nh^2 \to \infty$.

For $T_{n2}$, by Assumption (A5) and Assumption (A12)(i), arguments similar to those in Lemma 5.1 of [13] give $T_{n2} = o_p(nh^{l+1})$. This completes the proof of (i).

(ii) This part can be proved by the same arguments as the first part of this lemma, so we omit the details.

Lemma 4.4. Under assumptions (A1)-(A7) and conditions (ii) of assumptions (A8)-(A12), for any random sequence $\{\eta_j\}_{j=1}^n$ with $\max_{1\le j\le n}|\eta_j| = o_p(1)$, we have

(i) $\displaystyle\sum_{i=1}^n \psi_2'(v_{i\Delta}+\eta_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l = nh^{l+1}H_1(x_0)p(x_0)K_l\,(1+o_p(1))$;

(ii) $\displaystyle\sum_{i=1}^n \psi_2'(v_{i\Delta}+\eta_{i\Delta})R_2(\tilde X_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l = nh^{l+3}\,\frac{H_1(x_0)}{2}\,p(x_0)(\sigma^2(x_0))''K_{l+2}\,(1+o_p(1))$,

where $R_2(\tilde X_{i\Delta}) = \sigma^2(\tilde X_{i\Delta}) - \sigma^2(x_0) - (\sigma^2(x_0))'(\tilde X_{i\Delta}-x_0)$.

Proof. The proof of this lemma is similar to that of Lemma 4.3, so we omit the details.

The proofs of the following two lemmas are similar to Theorem 1 in [5], so we omit the details.
Lemma 4.5. Under assumptions (A1)-(A7), conditions (i) of assumptions (A8)-(A11), and assumptions (A13)-(A15), we have
\[
\frac{1}{\sqrt{nh}}\begin{pmatrix}\displaystyle\sum_{i=1}^n \psi_1(u_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\\[6pt] \displaystyle\sum_{i=1}^n \psi_1(u_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix} \xrightarrow{D} N(0, \Sigma_3),
\]
where $\Sigma_3 = G_2(x_0)p(x_0)V$.

Lemma 4.6. Under assumptions (A1)-(A7), conditions (ii) of assumptions (A8)-(A11), and assumptions (A13)-(A15), we have
\[
\frac{1}{\sqrt{nh}}\begin{pmatrix}\displaystyle\sum_{i=1}^n \psi_2(v_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\\[6pt] \displaystyle\sum_{i=1}^n \psi_2(v_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix} \xrightarrow{D} N(0, \Sigma_4),
\]
where $\Sigma_4 = H_2(x_0)p(x_0)V$.

Proof of Theorem 2.6. (i) We first prove the consistency of the local M-estimators of $\mu(x)$ and $\mu'(x)$. Let
\[
r = (a_1, hb_1)^T, \qquad r_0 = (\mu(x_0), h\mu'(x_0))^T, \qquad r_{i\Delta} = (r-r_0)^T\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix},
\]
and
\[
L_n(r) = \sum_{i=1}^n \rho_1\left(\frac{\tilde X_{(i+1)\Delta}-\tilde X_{i\Delta}}{\Delta} - a_1 - b_1(\tilde X_{i\Delta}-x_0)\right)K\left(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\right).
\]
Then
\[
r_{i\Delta} = (a_1-\mu(x_0),\ hb_1-h\mu'(x_0))\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix}
= a_1-\mu(x_0) + (b_1-\mu'(x_0))(\tilde X_{i\Delta}-x_0)
\]
\[
= a_1 + b_1(\tilde X_{i\Delta}-x_0) - \mu(x_0) - \mu'(x_0)(\tilde X_{i\Delta}-x_0)
= a_1 + b_1(\tilde X_{i\Delta}-x_0) + R_1(\tilde X_{i\Delta}) - \mu(\tilde X_{i\Delta})
= a_1 + b_1(\tilde X_{i\Delta}-x_0) + R_1(\tilde X_{i\Delta}) - \Big(\frac{\tilde X_{(i+1)\Delta}-\tilde X_{i\Delta}}{\Delta} - u_{i\Delta}\Big).
\]
Let $S_\delta$ be the circle centered at $r_0$ with radius $\delta$. We will show that for any sufficiently small $\delta$,
\[
\lim_{n\to\infty} P\Big\{\inf_{r\in S_\delta} L_n(r) > L_n(r_0)\Big\} = 1. \tag{4.2}
\]
In fact, for $r \in S_\delta$,
\[
L_n(r) - L_n(r_0) = \sum_{i=1}^n \rho_1\Big(\frac{\tilde X_{(i+1)\Delta}-\tilde X_{i\Delta}}{\Delta} - a_1 - b_1(\tilde X_{i\Delta}-x_0)\Big)K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)
- \sum_{i=1}^n \rho_1\Big(\frac{\tilde X_{(i+1)\Delta}-\tilde X_{i\Delta}}{\Delta} - \mu(x_0) - \mu'(x_0)(\tilde X_{i\Delta}-x_0)\Big)K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)
\]
\[
= \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\big[\rho_1(u_{i\Delta}+R_1(\tilde X_{i\Delta})-r_{i\Delta}) - \rho_1(u_{i\Delta}+R_1(\tilde X_{i\Delta}))\big]
= \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\int_{u_{i\Delta}+R_1(\tilde X_{i\Delta})}^{u_{i\Delta}+R_1(\tilde X_{i\Delta})-r_{i\Delta}}\psi_1(t)\,dt
\]
\[
= \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\int \psi_1(u_{i\Delta})\,dt
+ \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\int \psi_1'(u_{i\Delta})(t-u_{i\Delta})\,dt
+ \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\int \big[\psi_1(t)-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})(t-u_{i\Delta})\big]\,dt
:= L_{n1}+L_{n2}+L_{n3},
\]
where all the inner integrals run from $u_{i\Delta}+R_1(\tilde X_{i\Delta})$ to $u_{i\Delta}+R_1(\tilde X_{i\Delta})-r_{i\Delta}$. Next, we will show that
\[
L_{n1} = o_p(nh\delta), \tag{4.3}
\]
\[
L_{n2} = \frac{nh}{2}(r-r_0)^T G_1(x_0)p(x_0)U(1+o_p(1))(r-r_0) + O_p(nh^3\delta), \tag{4.4}
\]
\[
L_{n3} = o_p(nh\delta^2). \tag{4.5}
\]
For (4.3), we have
\[
L_{n1} = \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\psi_1(u_{i\Delta})(-r_{i\Delta})
= -(r-r_0)^T\sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\psi_1(u_{i\Delta})\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix} = -(r-r_0)^T W_n,
\]
where
\[
W_n = \begin{pmatrix}\displaystyle\sum_{i=1}^n\psi_1(u_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\\[6pt] \displaystyle\sum_{i=1}^n\psi_1(u_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix}.
\]
By Lemma 4.5, we have $E(W_n) = o(1)$ and $\mathrm{Var}(W_n) = nhG_2(x_0)p(x_0)V(1+o(1))$. Noting that $W_n = E(W_n) + O_p\big(\sqrt{\mathrm{Var}(W_n)}\big)$, we have $W_n = O_p(\sqrt{nh})$, which implies that (4.3) holds.

For (4.4), we have
\[
L_{n2} = \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\int_{u_{i\Delta}+R_1(\tilde X_{i\Delta})}^{u_{i\Delta}+R_1(\tilde X_{i\Delta})-r_{i\Delta}}\psi_1'(u_{i\Delta})(t-u_{i\Delta})\,dt
= \frac12\sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\psi_1'(u_{i\Delta})\big(r_{i\Delta}^2 - 2R_1(\tilde X_{i\Delta})r_{i\Delta}\big)
\]
\[
= \frac12\sum_{i=1}^n \psi_1'(u_{i\Delta})(r-r_0)^T K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)
\begin{pmatrix}1 & \frac{\tilde X_{i\Delta}-x_0}{h}\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h} & \frac{(\tilde X_{i\Delta}-x_0)^2}{h^2}\end{pmatrix}(r-r_0)
- \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\psi_1'(u_{i\Delta})R_1(\tilde X_{i\Delta})r_{i\Delta}
:= L_{n21} + L_{n22}.
\]
From Lemma 4.3 with $l = 0$, $l = 1$, and $l = 2$, respectively, we have
\[
L_{n21} = \frac{nh}{2}(r-r_0)^T G_1(x_0)p(x_0)\begin{pmatrix}K_0 & K_1\\ K_1 & K_2\end{pmatrix}(1+o_p(1))(r-r_0)
= \frac{nh}{2}(r-r_0)^T G_1(x_0)p(x_0)U(1+o_p(1))(r-r_0)
\]
and
\[
L_{n22} = -\sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\psi_1'(u_{i\Delta})R_1(\tilde X_{i\Delta})\,r_{i\Delta}
\]
\[
= -(r-r_0)^T\sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\psi_1'(u_{i\Delta})R_1(\tilde X_{i\Delta})\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix}
= -\frac{nh^3}{2}(r-r_0)^T G_1(x_0)\mu''(x_0)p(x_0)\begin{pmatrix}K_2\\ K_3\end{pmatrix}(1+o_p(1)) = O_p(nh^3\delta).
\]
Therefore
\[
L_{n2} = L_{n21} + L_{n22} = \frac{nh}{2}(r-r_0)^T G_1(x_0)p(x_0)U(1+o_p(1))(r-r_0) + O_p(nh^3\delta).
\]
For (4.5), we have
\[
L_{n3} = \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\int_{u_{i\Delta}+R_1(\tilde X_{i\Delta})}^{u_{i\Delta}+R_1(\tilde X_{i\Delta})-r_{i\Delta}}\big[\psi_1(t)-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})(t-u_{i\Delta})\big]\,dt
= \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\int_{R_1(\tilde X_{i\Delta})}^{R_1(\tilde X_{i\Delta})-r_{i\Delta}}\big[\psi_1(t+u_{i\Delta})-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})t\big]\,dt
\]
\[
= \sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\big[\psi_1(z_{i\Delta}+u_{i\Delta})-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})z_{i\Delta}\big](-r_{i\Delta})
= -(r-r_0)^T\sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\big[\psi_1(z_{i\Delta}+u_{i\Delta})-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})z_{i\Delta}\big]\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix},
\]
where the second-to-last equality follows from the integral mean value theorem and $z_{i\Delta}$ lies between $R_1(\tilde X_{i\Delta})$ and $R_1(\tilde X_{i\Delta})-r_{i\Delta}$, for $i = 1, 2, \dots, n$. Since $|\tilde X_{i\Delta}-x_0| \le h$ on the support of the kernel, we have
\[
\max_i|z_{i\Delta}| \le \max_i|R_1(\tilde X_{i\Delta})| + \Big|(r-r_0)^T\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix}\Big| \le \max_i|R_1(\tilde X_{i\Delta})| + 2\delta, \tag{4.6}
\]
and by Taylor's expansion,
\[
\max_i|R_1(\tilde X_{i\Delta})| = \max_i\big|\mu(\tilde X_{i\Delta})-\mu(x_0)-\mu'(x_0)(\tilde X_{i\Delta}-x_0)\big| = \max_i\Big|\frac12\mu''(\xi_i)(\tilde X_{i\Delta}-x_0)^2\Big| \le O_p(h^2), \tag{4.7}
\]
where $\xi_i$ lies between $\tilde X_{i\Delta}$ and $x_0$, for $i = 1, 2, \dots, n$.

For any given $\eta > 0$, let $D_\eta = \{(\delta_{1\Delta}, \delta_{2\Delta}, \dots, \delta_{n\Delta})^T : |\delta_{i\Delta}| \le \eta,\ \forall i \le n\}$. By Assumption (A12)(i) and $|\tilde X_{i\Delta}-x_0| \le h$, we have
\[
E\Big[\sup_{D_\eta}\Big|\sum_{i=1}^n\big[\psi_1(\delta_{i\Delta}+u_{i\Delta})-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})\delta_{i\Delta}\big]K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l\Big|\Big]
\]
\[
\le E\Big[\sum_{i=1}^n\sup_{D_\eta}\big|\psi_1(\delta_{i\Delta}+u_{i\Delta})-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})\delta_{i\Delta}\big|\,K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)|\tilde X_{i\Delta}-x_0|^l\Big]
\le a_\eta\delta\,E\Big[\sum_{i=1}^n K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)|\tilde X_{i\Delta}-x_0|^l\Big]
\le b_\eta\,nh^{l+1}\delta,
\]
where $a_\eta$ and $b_\eta$ are sequences of positive numbers tending to zero as $\eta \to 0$. Therefore, by (4.6) and (4.7),
\[
\sum_{i=1}^n\big[\psi_1(z_{i\Delta}+u_{i\Delta})-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})z_{i\Delta}\big]K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)(\tilde X_{i\Delta}-x_0)^l = o_p(nh^{l+1}\delta),
\]
which implies that (4.5) holds.

Let $\lambda$ be the smallest eigenvalue of the positive definite matrix $U$. Then, for any $r \in S_\delta$,
\[
L_n(r)-L_n(r_0) = L_{n1}+L_{n2}+L_{n3} = \frac{nh}{2}G_1(x_0)p(x_0)(r-r_0)^TU(r-r_0)(1+o_p(1)) + O_p(nh^3\delta)
\ge \frac{nh}{2}G_1(x_0)p(x_0)\lambda\delta^2(1+o_p(1)) + O_p(nh^3\delta).
\]
So, as $n \to \infty$,
\[
P\Big\{\inf_{r\in S_\delta}\big(L_n(r)-L_n(r_0)\big) > 0\Big\} \to 1,
\]
which implies that (4.2) holds. From (4.2), we know that $L_n(r)$ has a local minimum in the interior of $S_\delta$, so there exist solutions to equation (2.3). Let $(\hat\mu(x_0), h\hat\mu'(x_0))^T$ be the solution closest to $r_0 = (\mu(x_0), h\mu'(x_0))^T$; then
\[
\lim_{n\to\infty} P\big\{(\hat\mu(x_0)-\mu(x_0))^2 + h^2(\hat\mu'(x_0)-\mu'(x_0))^2 \le \delta^2\big\} = 1,
\]
which implies the consistency of the local M-estimators of $\mu(x)$ and $\mu'(x)$.

(ii) We now prove the asymptotic normality of the local M-estimators of $\mu(x)$ and $\mu'(x)$. Let
\[
\hat\eta_{i\Delta} = R_1(\tilde X_{i\Delta}) - (\hat\mu(x_0)-\mu(x_0)) - (\hat\mu'(x_0)-\mu'(x_0))(\tilde X_{i\Delta}-x_0). \tag{4.8}
\]
Then
\[
\frac{\tilde X_{(i+1)\Delta}-\tilde X_{i\Delta}}{\Delta} = \mu(\tilde X_{i\Delta}) + u_{i\Delta}
= u_{i\Delta} + \mu(\tilde X_{i\Delta}) - \mu(x_0) - \mu'(x_0)(\tilde X_{i\Delta}-x_0) + \mu(x_0) + \mu'(x_0)(\tilde X_{i\Delta}-x_0)
\]
\[
= u_{i\Delta} + R_1(\tilde X_{i\Delta}) + \mu(x_0) + \mu'(x_0)(\tilde X_{i\Delta}-x_0)
= \hat\mu(x_0) + \hat\mu'(x_0)(\tilde X_{i\Delta}-x_0) + u_{i\Delta} + \hat\eta_{i\Delta}.
\]
Therefore, by (2.3), we have
\[
\sum_{i=1}^n \psi_1(u_{i\Delta}+\hat\eta_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix} = 0. \tag{4.9}
\]
Let
\[
T_{n1} = \sum_{i=1}^n\psi_1(u_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix} = W_n,
\]
\[
T_{n2} = \sum_{i=1}^n\psi_1'(u_{i\Delta})\hat\eta_{i\Delta}K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix},
\qquad
T_{n3} = \sum_{i=1}^n\big[\psi_1(u_{i\Delta}+\hat\eta_{i\Delta})-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})\hat\eta_{i\Delta}\big]K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix}.
\]
Then, by (4.9), we have $T_{n1}+T_{n2}+T_{n3} = 0$. And by (4.8), we have
\[
T_{n2} = \sum_{i=1}^n \psi_1'(u_{i\Delta})R_1(\tilde X_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix}
- \sum_{i=1}^n \psi_1'(u_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\big[(\hat\mu(x_0)-\mu(x_0)) + (\hat\mu'(x_0)-\mu'(x_0))(\tilde X_{i\Delta}-x_0)\big]\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix}
\]
\[
= \sum_{i=1}^n \psi_1'(u_{i\Delta})R_1(\tilde X_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix}
- \sum_{i=1}^n \psi_1'(u_{i\Delta})K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)
\begin{pmatrix}1 & \frac{\tilde X_{i\Delta}-x_0}{h}\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h} & \frac{(\tilde X_{i\Delta}-x_0)^2}{h^2}\end{pmatrix}
\begin{pmatrix}\hat\mu(x_0)-\mu(x_0)\\ h(\hat\mu'(x_0)-\mu'(x_0))\end{pmatrix}
\]
\[
= \frac{nh^3}{2}G_1(x_0)\mu''(x_0)p(x_0)\begin{pmatrix}K_2\\ K_3\end{pmatrix}(1+o_p(1))
- nhG_1(x_0)p(x_0)\begin{pmatrix}K_0&K_1\\K_1&K_2\end{pmatrix}(1+o_p(1))\begin{pmatrix}\hat\mu(x_0)-\mu(x_0)\\ h(\hat\mu'(x_0)-\mu'(x_0))\end{pmatrix}
\]
\[
= \frac{nh^3}{2}G_1(x_0)\mu''(x_0)p(x_0)\,A\,(1+o_p(1)) - nhG_1(x_0)p(x_0)\,U\,(1+o_p(1))\begin{pmatrix}\hat\mu(x_0)-\mu(x_0)\\ h(\hat\mu'(x_0)-\mu'(x_0))\end{pmatrix} := T_{n21}+T_{n22},
\]
where the third equality follows from Lemma 4.3. Noting that $|\tilde X_{i\Delta}-x_0| \le h$ on the kernel support, we have
\[
\sup_i|\hat\eta_{i\Delta}| = \sup_i\big|R_1(\tilde X_{i\Delta}) - (\hat\mu(x_0)-\mu(x_0)) - (\hat\mu'(x_0)-\mu'(x_0))(\tilde X_{i\Delta}-x_0)\big|
\le \sup_i|R_1(\tilde X_{i\Delta})| + |\hat\mu(x_0)-\mu(x_0)| + h|\hat\mu'(x_0)-\mu'(x_0)|
\]
\[
= O_p\big(h^2 + (\hat\mu(x_0)-\mu(x_0)) + h(\hat\mu'(x_0)-\mu'(x_0))\big) = o_p(1),
\]
where the last equality follows from the consistency of $(\hat\mu(x_0), h\hat\mu'(x_0))$. Then, by Assumption (A12)(i) and the same argument as in the first part of the proof of Theorem 2.6, we have
\[
T_{n3} = \sum_{i=1}^n\big[\psi_1(u_{i\Delta}+\hat\eta_{i\Delta})-\psi_1(u_{i\Delta})-\psi_1'(u_{i\Delta})\hat\eta_{i\Delta}\big]K\Big(\frac{\tilde X_{(i-1)\Delta}-x_0}{h}\Big)\begin{pmatrix}1\\[2pt] \frac{\tilde X_{i\Delta}-x_0}{h}\end{pmatrix}
= o_p(nh)\big[h^2 + (\hat\mu(x_0)-\mu(x_0)) + h(\hat\mu'(x_0)-\mu'(x_0))\big] = o_p(T_{n22}).
\]
Therefore, by $T_{n1}+T_{n2}+T_{n3}=0$, we have
\[
\begin{pmatrix}\hat\mu(x_0)-\mu(x_0)\\ h(\hat\mu'(x_0)-\mu'(x_0))\end{pmatrix}
= \frac{1}{nh}\,G_1^{-1}(x_0)p^{-1}(x_0)U^{-1}(1+o_p(1))W_n + \frac{h^2}{2}\mu''(x_0)U^{-1}A\,(1+o_p(1)).
\]
It follows that
\[
\sqrt{nh}\left[\begin{pmatrix}\hat\mu(x_0)-\mu(x_0)\\ h(\hat\mu'(x_0)-\mu'(x_0))\end{pmatrix} - \frac{h^2\mu''(x_0)}{2}U^{-1}A\,(1+o_p(1))\right]
= G_1^{-1}(x_0)p^{-1}(x_0)U^{-1}(1+o_p(1))\,\frac{1}{\sqrt{nh}}\,W_n.
\]
So, by Lemma 4.5, Assumption (A5), and Slutsky's theorem,
\[
\sqrt{nh}\left[\begin{pmatrix}\hat\mu(x_0)-\mu(x_0)\\ h(\hat\mu'(x_0)-\mu'(x_0))\end{pmatrix} - \frac{h^2\mu''(x_0)}{2}U^{-1}A\right]
\xrightarrow{D} G_1^{-1}(x_0)p^{-1}(x_0)U^{-1}\,N(0, \Sigma_3)
= N\Big(0,\ \frac{G_2(x_0)}{G_1^2(x_0)p(x_0)}\,U^{-1}VU^{-1}\Big) = N(0, \Sigma_1).
\]
This completes the proof.

Proof of Theorem 2.7. The proof follows from that of Theorem 2.6, using Lemmas 4.4 and 4.6 in place of Lemmas 4.3 and 4.5, and is omitted here.

Acknowledgements

This research is supported by the NSFC (No. 11461032 and No. 11401267) and the SF of the Jiangxi Provincial Education Department (No. GJJ150687).

References

[1] L. Arnold, Stochastic Differential Equations: Theory and Applications, Wiley, New York, (1974).
[2] F. Bandi, P. Phillips, Fully nonparametric estimation of scalar diffusion models, Econometrica, 71 (2003), 241–283.
[3] B. M. Bibby, M. Jacobsen, M. Sørensen, Estimating functions for discretely sampled diffusion-type models, in: Handbook of Financial Econometrics, North-Holland, Amsterdam, (2002).
[4] B. M. Bibby, M. Sørensen, Martingale estimation functions for discretely observed diffusion processes, Bernoulli, 1 (1995), 17–39.
[5] Z. Cai, E. Ould-Saïd, Local M-estimator for nonparametric time series, Statist. Probab. Lett., 65 (2003), 433–449.
[6] F. Comte, V. Genon-Catalot, Y. Rozenholc, Penalized nonparametric mean square estimation of the coefficients of diffusion processes, Bernoulli, 13 (2007), 514–543.
[7] F. Comte, V. Genon-Catalot, Y. Rozenholc, Nonparametric adaptive estimation for integrated diffusions, Stochastic Process. Appl., 119 (2009), 811–834.
[8] R. A. Davis, K. Knight, J. Liu, M-estimation for autoregressions with infinite variance, Stochastic Process. Appl., 40 (1992), 145–180.
[9] M. Delecroix, M. Hristache, V. Patilea, On semiparametric M-estimation in single-index regression, J. Statist. Plann. Inference, 136 (2006), 730–769.
[10] P. D. Ditlevsen, S. Ditlevsen, K. K. Andersen, The fast climate fluctuations during the stadial and interstadial climate states, Ann. Glaciol., 35 (2002), 457–462.
[11] S. Ditlevsen, M. Sørensen, Inference for observations of integrated diffusion processes, Scand. J. Statist., 31 (2004), 417–429.
[12] J. Fan, I. Gijbels, Local Polynomial Modelling and Its Applications, Chapman & Hall, London, (1996).
[13] J. Fan, J. Jiang, Variable bandwidth and one-step local M-estimator, Sci. China Ser. A, 43 (2000), 65–81.
[14] J. Fan, C. Zhang, A re-examination of diffusion estimators with applications to financial model validation, J. Amer. Statist. Assoc., 98 (2003), 118–134.
[15] A. Gloter, Discrete sampling of an integrated diffusion process and parameter estimation of the diffusion coefficient, ESAIM Probab. Statist., 4 (2000), 205–227.
[16] A. Gloter, Parameter estimation for a discretely observed integrated diffusion process, Scand. J. Statist., 33 (2006), 83–104.
[17] A. Gloter, E. Gobet, LAMN property for hidden processes: the case of integrated diffusions, Ann. Inst. Henri Poincaré Probab. Stat., 44 (2008), 104–128.
[18] E. Gobet, M. Hoffmann, M. Reiss, Nonparametric estimation of scalar diffusions based on low frequency data, Ann. Statist., 32 (2004), 2223–2253.
[19] P. Hall, M. C. Jones, Adaptive M-estimation in nonparametric regression, Ann. Statist., 18 (1990), 1712–1728.
[20] P. J. Huber, Robust regression: asymptotics, conjectures and Monte Carlo, Ann. Statist., 1 (1973), 799–821.
[21] J. C. Jiang, Y. P. Mack, Robust local polynomial regression for dependent data, Statist. Sinica, 11 (2001), 705–722.
[22] J. Nicolau, Nonparametric estimation of second-order stochastic differential equations, Econometric Theory, 23 (2007), 880–898.
[23] L. C. G. Rogers, D. Williams, Diffusions, Markov Processes, and Martingales, Vol. 2, Cambridge University Press, Cambridge, (2000).
[24] H. C. Wang, Z. Y. Lin, Local linear estimation of second-order diffusion models, Commun. Statist., 40 (2011), 394–407.
[25] Y. Y. Wang, L. X. Zhang, M. T. Tang, Re-weighted functional estimation of second-order diffusion processes, Metrika, 75 (2012), 1129–1151.