Journal of Mathematical Sciences, Vol. 220, No. 6, February, 2017

ESTIMATION OF DISTRIBUTIONS UNDER DOSE-EFFECT DEPENDENCE WITH FIXED EXPERIMENT PLAN

M. S. Tikhov, D. S. Krishtopenko, and M. V. Yaroschuk

Lobachevsky State University of Nizhni Novgorod, Nizhni Novgorod, Russia, e-mail: tikhovm@mail.ru

Translated from Statisticheskie Metody Otsenivaniya i Proverki Gipotez, Vol. 19, pp. 66–77, 2006.

In the present paper we find the limit distributions of the Nadaraya–Watson estimators and of asymptotically unbiased estimators of a distribution function under dose-effect dependence with fixed experiment plans. We prove the asymptotic normality of the integrated square errors of the estimators of the distribution function.

Introduction

The goal of the present paper is to study the asymptotic properties of estimators of the distribution function under dose-effect dependence in the case where the injected doses are not random but fixed beforehand. Such statistics are of interest in toxicometry for the determination of average effective doses (see [1]). The results may be used for the construction of tests for goodness-of-fit and homogeneity hypotheses under dose-effect dependence.

1. Nonparametric estimators of the distribution function

Let $Z = \{(X_i, U_i),\ i \ge 1\}$ be a stationary sequence of independent pairs of random variables with joint distribution function $F(x)G(u)$ and density $f(x)g(u)$. We observe the sample $U^{(n)} = \{(U_i, W_i),\ 1 \le i \le n\}$, where $W_i = I(X_i < U_i)$ is the indicator of the event $(X_i < U_i)$. The problem is to estimate the distribution function $F(x)$ from the sample $U^{(n)}$.

To estimate $F(x)$ statistically, following Nadaraya and Watson (see, e.g., [2, 3]), one uses the values
$$\widehat F_n(x) = \frac{S_{2n}^*(x)}{S_{1n}^*(x)}, \qquad S_{1n}^*(x) = \frac1n\sum_{i=1}^n K_h(U_i - x), \qquad S_{2n}^*(x) = \frac1n\sum_{i=1}^n W_i K_h(U_i - x),$$
where the kernel $K(\cdot) \ge 0$ is defined on $\mathbb R$, $K_h(x) = (1/h)K(x/h)$, and $h > 0$ is the smoothing parameter.

The asymptotic behavior of the empirical process $\sqrt{nh}\,(\widehat F_n(x) - F(x))$ was studied in [4], where the asymptotic normality of the statistic $\widehat F_n(x)$ was established. Note that the estimator $\widehat F_n(x)$ is asymptotically biased. In [5] an asymptotically unbiased estimator of the distribution function under dose-effect dependence was proposed, which has the same limit variance as the one in [4]. In these papers $U$ is interpreted as the dose injected into the body, $X$ is the minimal dose at which the organism begins to react, and $U$ is considered to be a random variable. Quite often, however, the injected dose is chosen beforehand, i.e., it is not a random variable, and in the present paper we study the asymptotic behavior of the proposed statistic in this situation.

Let us introduce the functions
$$S_{1n}(x) = \frac1n\sum_{i=1}^n K_h(u_i - x), \qquad S_{2n}(x) = \frac1n\sum_{i=1}^n W_i K_h(u_i - x),$$
where the $u_i$ are fixed values, and $W_i = I(X_i < u_i)$ is the indicator of the event $(X_i < u_i)$. We assume that $a \le u_i \le b$ are given and, without loss of generality, consider the case $a = 0$, $b = 1$, $u_i = i/n$. Let $\|K\|^2 = \int K^2(x)\,dx$ and $\nu^2 = \int x^2 K(x)\,dx$.
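For illustration, here is a minimal numerical sketch of the fixed-design estimator $\widehat F_n(x) = S_{2n}(x)/S_{1n}(x)$. The Epanechnikov kernel, the hypothetical distribution $F(x) = x^2$ on $[0,1]$ (so that $X = \sqrt{U}$ with $U$ uniform), the constant $C = 1$ in $h = Cn^{-1/5}$, and all function names are illustrative assumptions of the sketch, not part of the paper.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75*(1 - t^2) on [-1, 1]."""
    return 0.75 * (1.0 - t**2) * (np.abs(t) <= 1.0)

def nw_estimate(x, u, w, h, kernel=epanechnikov):
    """Nadaraya-Watson-type estimate F_hat(x) = S2n(x)/S1n(x) for a fixed design u_i."""
    t = (u - x) / h                      # scaled distances (u_i - x)/h
    kh = kernel(t) / h                   # K_h(u_i - x) = (1/h) K((u_i - x)/h)
    s1 = kh.mean()                       # S1n(x)
    s2 = (w * kh).mean()                 # S2n(x)
    return s2 / s1 if s1 > 0 else np.nan

# --- illustration on synthetic dose-effect data -------------------------
rng = np.random.default_rng(0)
n = 2000
u = np.arange(1, n + 1) / n              # fixed design u_i = i/n
X = np.sqrt(rng.uniform(size=n))         # latent thresholds with F(x) = x^2 (never observed)
W = (X < u).astype(float)                # observed indicators W_i = I(X_i < u_i)
h = n**(-1/5)                            # smoothing parameter h = C n^{-1/5}, C = 1

x0 = 0.5
print("F_hat(0.5) =", nw_estimate(x0, u, W, h), "  F(0.5) =", x0**2)
```

Note that the estimator sees only the fixed design points $u_i = i/n$ and the indicators $W_i$; the latent thresholds $X_i$ are generated solely to simulate the $W_i$.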
We assume that the following conditions (A) are satisfied:

(A1) $K(x) \ge 0$ is a bounded even function, and $\|K\|^2 < \infty$;
(A2) $K(x) = 0$ for $x \notin [-1, 1]$;
(A3) $\int K(x)\,dx = 1$, $\nu^2 < \infty$;
(A4) the function $f(x)$ is continuously differentiable, the derivatives $f'(x)$ and $f''(x)$ are bounded, and $\int (f'(x))^2\,dx < \infty$;
(A5) $f(x)/F(x)$ and $f'(x)f(x)/F(x)$ are bounded integrable functions;
(A6) $\int (f'(x))^4\,dx < \infty$.

Let us consider $S_{1n}(x)$, which is an integral sum for $A = (1/h)\int K(x/h)\,dx$, and estimate the deviation of $S_{1n}(x)$ from $A$. Recall the Koksma–Hlawka (KH) inequality (see [6, 7]):
$$\Big|\int_0^1 f(u)\,du - \frac1n\sum_{i=1}^n f(u_i)\Big| \le \bigvee(f)\,D^*(P_n),$$
where $\bigvee(f)$ is the variation of the function $f$ (we assume that $\bigvee(f)$ is finite; a sufficient condition for the finiteness of $\bigvee(f)$ is the boundedness of the derivative $f'(x)$), $P_n = \{u_1, u_2, \dots, u_n\}$, and
$$D^*(P_n) = \sup_{J = [0, a) \subset [0, 1)}\Big|\int I_J(x)\,dx - \frac1n\sum_{i=1}^n I_J(x_i)\Big|.$$
For the sequence $P_n = \{1/n, 2/n, \dots, (n-1)/n, 1\}$ we have $D^*(P_n) = 1/(2n)$, i.e.,
$$\Big|\int_0^1 f(u)\,du - \frac1n\sum_{i=1}^n f(u_i)\Big| \le \bigvee(f)\cdot\frac1{2n}.$$
From the KH inequality it follows that
$$\Big|\frac1n\sum_{i=1}^n K_h(i/n - x) - \int_0^1 K_h(u - x)\,du\Big| \le \bigvee(K_h)\cdot\frac1{2n}.$$
But for $h \to 0$ (and $0 < x < 1$)
$$\int_0^1 K_h(u - x)\,du = \int_{-1}^1 K(t)\,dt = 1.$$
Therefore, if $h \to 0$ and $nh \to \infty$ as $n \to \infty$ (note that $\bigvee(K_h) = \bigvee(K)/h$), the sequence $S_{1n}(x)$ converges to 1.

Now consider the sum
$$S_{2n}(x) = \frac1n\sum_{i=1}^n W_i K_h(i/n - x).$$
Its mathematical expectation, as $n \to \infty$ and $h \to 0$, equals
$$\mathsf E(S_{2n}(x)) = \frac1n\sum_{i=1}^n F(i/n)K_h(i/n - x) = \int_0^1 F(u)K_h(u - x)\,du + O\Big(\frac1{nh}\Big) = \int_{-1}^1 F(x + ht)K(t)\,dt + O\Big(\frac1{nh}\Big)$$
$$= F(x) + \frac{f'(x)h^2}{2}\int_{-1}^1 t^2K(t)\,dt + o(h^2) + O\Big(\frac1{nh}\Big) = F(x) + \frac{f'(x)h^2}{2}\,\nu^2 + o(h^2) + O\Big(\frac1{nh}\Big).$$
The variance of the sum $S_{2n}(x)$ equals
$$\mathsf D\Big(\frac1n\sum_{i=1}^n W_iK_h(i/n - x)\Big) = \frac1{n^2}\sum_{i=1}^n F(i/n)(1 - F(i/n))K_h^2(i/n - x)$$
$$= \frac1n\int_0^1 F(u)(1 - F(u))K_h^2(u - x)\,du\,(1 + o(1)) = \frac1{nh}\,F(x)(1 - F(x))\|K\|^2(1 + o(1)).$$

Theorem 1. Let conditions (A) be satisfied, and let $h = Cn^{-1/5}$. Then, as $n \to \infty$,
$$\sqrt{nh}\,(\widehat F_n(x) - F(x)) \xrightarrow[n\to\infty]{d} N(a(x), \sigma^2(x)),$$
where $a(x) = (1/2)f'(x)\nu^2$ and $\sigma^2(x) = F(x)(1 - F(x))\|K\|^2$.

Proof. The asymptotic normality of $\widehat F_n(x)$ follows from the previous considerations and the boundedness of the terms $W_iK_h(i/n - x)$ (see [8]).

Now let us construct the estimator of the distribution function:

1) set $h_0 = C_1 n^{-\alpha}$ and calculate
$$\tilde\varphi(x) = \frac1n\sum_{i=1}^n W_i K_{h_0}(i/n - x);$$

2) taking $h_1 = C_2 n^{-1/5}$, estimate the function $\beta(x)$ by
$$\hat\beta(x) = \frac1n\sum_{j=1}^n \frac{W_j K_{h_1}(j/n - x)}{\tilde\varphi(j/n)};$$

3) multiplying the estimator $\hat\beta(x)$ by the estimator $\tilde\varphi(x)$, we obtain
$$\hat\varphi(x) = \hat\beta(x)\tilde\varphi(x) = \frac{\tilde\varphi(x)}{n}\sum_{j=1}^n \frac{W_j K_{h_1}(j/n - x)}{\tilde\varphi(j/n)}.$$
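A sketch of this three-step construction in code, under the same illustrative assumptions as before (Epanechnikov kernel, hypothetical $F(x) = x^2$, $C_1 = C_2 = 1$). The guard `eps` against a vanishing pilot estimate near the left endpoint and the helper names are mine, not the paper's.

```python
import numpy as np

def epanechnikov(t):
    return 0.75 * (1.0 - t**2) * (np.abs(t) <= 1.0)

def kh(t, h):
    """K_h(t) = (1/h) K(t/h)."""
    return epanechnikov(t / h) / h

def phi_tilde(x, u, w, h0):
    """Pilot estimate (step 1): (1/n) sum_i W_i K_{h0}(i/n - x)."""
    return np.mean(w * kh(u - x, h0))

def phi_hat(x, u, w, h0, h1, eps=1e-12):
    """Bias-corrected estimate (steps 2-3): phi_hat(x) = beta_hat(x) * phi_tilde(x)."""
    pt_x = phi_tilde(x, u, w, h0)
    # pilot values at the design points j/n (denominators in beta_hat)
    pt_u = np.array([phi_tilde(uj, u, w, h0) for uj in u])
    beta = np.mean(w * kh(u - x, h1) / np.maximum(pt_u, eps))
    return beta * pt_x

# --- illustration (same synthetic setup as before) ----------------------
rng = np.random.default_rng(1)
n = 1000
u = np.arange(1, n + 1) / n
X = np.sqrt(rng.uniform(size=n))          # latent thresholds with F(x) = x^2
W = (X < u).astype(float)
alpha = 0.15
h0, h1 = n**(-alpha), n**(-1/5)

print("phi_hat(0.5) =", phi_hat(0.5, u, W, h0, h1), "  F(0.5) =", 0.25)
```

The value $\alpha = 0.15$ is chosen to satisfy $1/10 < \alpha < 1/5$, the range required in Theorem 3 below.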
Let $\hat f_{n,h} = \hat f_{n,h}(x) = n^{-1}\sum_{i=1}^n K_h(x - X_i)$. Let us present the result from [9] which we use in what follows.

Theorem 2. If conditions (A1)–(A3) are satisfied for the function $K(x)$, and if $f(x)$ is bounded, then for any $c > 0$, with probability 1,
$$\limsup_{n\to\infty}\ \sup_{c\ln n/n \le h \le 1}\ \frac{\sqrt{nh}\,\|\hat f_{n,h} - \mathsf E(\hat f_{n,h})\|_\infty}{\sqrt{\max(\ln(1/h), \ln\ln n)}} = k(c) < \infty,$$
where $\|K\|_\infty = \sup_x|K(x)| < \infty$.

The following statement holds.

Theorem 3. Let conditions (A) be satisfied, and let $1/10 < \alpha < 1/5$. Then
$$\sqrt{nh_1}\,(\hat\varphi(x) - F(x)) \xrightarrow[n\to\infty]{d} N(0, \sigma^2(x)).$$

Proof. Let $\bar\varphi(x) = \mathsf E(\tilde\varphi(x))$. Then
$$\Big|\frac{\tilde\varphi(x)}{\tilde\varphi(j/n)} - \frac{\bar\varphi(x)}{\bar\varphi(j/n)}\Big| = \frac{|\tilde\varphi(x)\bar\varphi(j/n) - \bar\varphi(x)\tilde\varphi(j/n)|}{\tilde\varphi(j/n)\bar\varphi(j/n)} = \frac{|\bar\varphi(j/n)(\tilde\varphi(x) - \bar\varphi(x)) + \bar\varphi(x)(\bar\varphi(j/n) - \tilde\varphi(j/n))|}{\tilde\varphi(j/n)\bar\varphi(j/n)}$$
$$\le \frac1{\tilde\varphi(j/n)}\,|\tilde\varphi(x) - \bar\varphi(x)| + \frac{\bar\varphi(x)}{\tilde\varphi(j/n)\bar\varphi(j/n)}\,|\tilde\varphi(j/n) - \bar\varphi(j/n)|.$$
From Theorem 2 it follows that
$$\Delta_{n1} = \sup_x|\tilde\varphi(x) - \bar\varphi(x)| = O\Big(\sqrt{\frac{\ln n}{nh_0}}\Big), \qquad \Delta_{n2} = \sup_{1\le j\le n}|\tilde\varphi(j/n) - \bar\varphi(j/n)| = O\Big(\sqrt{\frac{\ln n}{nh_0}}\Big),$$
and hence
$$\sqrt{nh_1}\,\Delta_{n1} \xrightarrow[n\to\infty]{} 0, \qquad \sqrt{nh_1}\,\Delta_{n2} \xrightarrow[n\to\infty]{} 0,$$
since $\alpha < 1/5$. In this regard, we study the asymptotic behavior of the sums
$$S_{3n}(x) = \frac1n\sum_{j=1}^n W_j K_{h_1}(x - j/n)\,\frac{\bar\varphi(x)}{\bar\varphi(j/n)}.$$
First, let us consider the mathematical expectation of $S_{3n}(x)$. We have
$$\mathsf E(S_{3n}(x)) = \frac1n\sum_{j=1}^n F(j/n)K_{h_1}(x - j/n)\,\frac{\bar\varphi(x)}{\bar\varphi(j/n)} \sim \int F(u)K_{h_1}(x - u)\,\frac{\bar\varphi(x)}{\bar\varphi(u)}\,du = \int K(t)F(x + h_1t)\,\frac{\bar\varphi(x)}{\bar\varphi(x + h_1t)}\,dt.$$
Expanding $F(x + h_1t)/\bar\varphi(x + h_1t)$ and $\bar\varphi(x)$ into series in $h_0t$ and calculating their product, we obtain
$$\bar\varphi(x)\,\frac{F(x + h_1t)}{\bar\varphi(x + h_1t)} = F(x) + \frac{h_0^2t^2\nu^2}{2}\,f'(x) - \frac{h_0^2t^2\nu^2}{2}\,F(x)\,\frac{f'(x + h_1t)}{F(x + h_1t)} + o(h_0^2).$$
Then, expanding the ratio $f'(x + h_1t)/F(x + h_1t)$ into a series in $h_1t$, we obtain
$$F(x)\,\frac{f'(x + h_1t)}{F(x + h_1t)} = f'(x) + h_1t f''(x) - h_1t\,\frac{f'(x)f(x)}{F(x)} + o(h_1).$$
Finally, taking into account conditions (A), we obtain
$$\mathsf E(S_{3n}(x)) = F(x) + O(n^{-1/5 - 2\alpha}), \qquad \sqrt{nh_1}\;O(n^{-1/5 - 2\alpha}) \xrightarrow[n\to\infty]{} 0,$$
since $\alpha > 1/10$.

Remark 1. The restriction on $\alpha$ may be weakened to $0 < \alpha < 1/5$ if we additionally require the boundedness of the functions $f'''(x)/F(x)$, $f(x)f''(x)/F^2(x)$, $f(x)f'(x)/F^2(x)$, and $(f'(x))^2/F^2(x)$.

Next,
$$\mathsf D(S_{3n}(x)) = \frac1{n^2}\sum_{j=1}^n F(j/n)(1 - F(j/n))K_{h_1}^2(x - j/n)\,\frac{\bar\varphi^2(x)}{\bar\varphi^2(j/n)} \sim \frac1n\int F(u)(1 - F(u))K_{h_1}^2(x - u)\,\frac{\bar\varphi^2(x)}{\bar\varphi^2(u)}\,du$$
$$= \frac1{nh_1}\int F(x + h_1t)(1 - F(x + h_1t))K^2(t)\,\frac{\bar\varphi^2(x)}{\bar\varphi^2(x + h_1t)}\,dt \sim \frac1{nh_1}\,F(x)(1 - F(x))\|K\|^2,$$
since $\bar\varphi^2(x + h_1t) \to \bar\varphi^2(x)$ as $n \to \infty$ uniformly in $|t| \le t_0$.

In view of the boundedness of the functions $F(x)$ and $K(x)$ it is easy to show that the Lindeberg condition is satisfied (see [10]), and hence the statement of Theorem 3 follows.

2. The distribution of integrated square errors of nonparametric estimators of distribution functions

Let $X_1, X_2, \dots, X_n$ be independent identically distributed random variables with an unknown continuous distribution function $F(x)$. We observe the sample $\{(W_i, i/n),\ i = 1, 2, \dots, n\}$, where the $i/n$ are injected nonrandom doses and $W_i = I(X_i < i/n)$ is the indicator of the event $(X_i < i/n)$. Consider the sequence of Nadaraya–Watson estimators of the form $\widehat F_n(x) = S_{2n}(x)/S_{1n}(x)$,
$$S_{2n} = S_{2n}(x) = \frac1n\sum_{i=1}^n W_i K_h(x - i/n), \qquad S_{1n} = S_{1n}(x) = \frac1n\sum_{i=1}^n K_h(x - i/n).$$
Since $S_{1n} \to 1$ as $n \to \infty$, we will study the behavior of the integrated square error of the estimator of the distribution function of the form $F_n(x) = S_{2n}(x) = S_{2n}$, which is given by the formula
$$I_n = \int(F_n(x) - F(x))^2\omega(x)\,dx = \int(F_n(x) - \mathsf E(F_n(x)))^2\omega(x)\,dx$$
$$+ 2\int(F_n(x) - \mathsf E(F_n(x)))(\mathsf E(F_n(x)) - F(x))\,\omega(x)\,dx + \int(\mathsf E(F_n(x)) - F(x))^2\omega(x)\,dx.$$
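As a numerical illustration of the left-hand side of this decomposition, the following sketch approximates $I_n = \int(F_n(x) - F(x))^2\,dx$ by the trapezoidal rule on an interior grid. The weight $\omega \equiv 1$ on the chosen grid, the grid itself, and the hypothetical $F(x) = x^2$ are assumptions of the sketch, not of the paper.

```python
import numpy as np

def epanechnikov(t):
    return 0.75 * (1.0 - t**2) * (np.abs(t) <= 1.0)

def F_n(x, u, w, h):
    """F_n(x) = S_2n(x) = (1/n) sum_i W_i K_h(x - i/n)."""
    return np.mean(w * epanechnikov((x - u) / h) / h)

def ise(u, w, h, F, grid):
    """Approximate I_n = int (F_n(x) - F(x))^2 dx by the trapezoidal rule on a uniform grid."""
    vals = np.array([(F_n(x, u, w, h) - F(x)) ** 2 for x in grid])
    dx = grid[1] - grid[0]
    return dx * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

# --- illustration: the integrated square error typically shrinks as n grows ---
rng = np.random.default_rng(2)
F = lambda x: x**2                      # hypothetical F, as in the earlier sketches
grid = np.linspace(0.1, 0.9, 201)       # interior grid, avoiding boundary effects
for n in (200, 2000, 20000):
    u = np.arange(1, n + 1) / n         # fixed design i/n
    W = (np.sqrt(rng.uniform(size=n)) < u).astype(float)
    print(n, ise(u, W, n ** (-1 / 5), F, grid))
```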
Without loss of generality we assume that $\omega(x) \equiv 1$ and study each term of this decomposition separately, starting with $I_{n1}$:

(i) $I_{n1} = h^2n^{-1/2}J_{n1}$, $\displaystyle J_{n1} = \frac1{n^{1/2}h^2}\sum_{i=1}^n Z_{n1i}$, where
$$Z_{n1i} = \frac{\nu^2h^2}{2}\,(W_i - F(i/n))\int K_h(x - i/n)f'(x)\,dx \sim \frac{\nu^2h^2}{2}\,(W_i - F(i/n))f'(i/n)$$
as $n \to \infty$.

Lemma 1. Let conditions (A) be satisfied. Then the sequence $J_{n1}$ is asymptotically (as $n \to \infty$) normal with parameters $(0, \sigma_1^2)$, where
$$\sigma_1^2 = \frac{\nu^4}4\int_0^1 F(x)(1 - F(x))(f'(x))^2\,dx < \infty.$$

Proof. We have $\mathsf E(J_{n1}) = 0$ and, as $n \to \infty$,
$$\mathsf D(J_{n1}) \sim \frac{\nu^4}{4n}\sum_{i=1}^n F(i/n)(1 - F(i/n))(f'(i/n))^2 \xrightarrow[n\to\infty]{} \sigma_1^2 < \infty$$
by condition (A4). Next,
$$\frac1{nh^4}\sum_{i=1}^n \mathsf E\big(Z_{n1i}^2\,I(|Z_{n1i}| > \varepsilon n^{1/2}h^2)\big) \le \frac1{\varepsilon^2n^2h^8}\sum_{i=1}^n \mathsf E(Z_{n1i}^4)$$
$$= \frac{\nu^8}{16\varepsilon^2n^2}\sum_{i=1}^n \big(F(i/n)(1 - F(i/n))^4 + F(i/n)^4(1 - F(i/n))\big)(f'(i/n))^4 \le \frac{\nu^8}{64\varepsilon^2n^2}\sum_{i=1}^n (f'(i/n))^4 \sim \frac{\nu^8}{64\varepsilon^2n}\int_0^1(f'(x))^4\,dx \xrightarrow[n\to\infty]{} 0$$
by condition (A6). From here, by the central limit theorem (see [10], the Lindeberg theorem), the sequence
$$J_{n1} = n^{1/2}h^{-2}\int(F_n(x) - \mathsf E(F_n(x)))(\mathsf E(F_n(x)) - F(x))\,dx$$
is asymptotically normal with parameters $(0, \sigma_1^2)$ as $n \to \infty$.

Now let us study the asymptotic behavior of the term $I_{n2}$:

(ii) $I_{n2} = n^{-1}J_{n2}$, $\displaystyle J_{n2} \equiv \frac1n\sum_{i=1}^n(W_i - F(i/n))^2\int K_h^2(x - i/n)\,dx$.

Lemma 2. Let conditions (A) be satisfied, and let $h \to 0$, $nh^2 \to \infty$ as $n \to \infty$. Then
$$J_{n2} \xrightarrow[n\to\infty]{\mathsf P} \sigma_2^2.$$

Proof. We have
$$\mathsf E(J_{n2}) = \frac1n\sum_{i=1}^n F(i/n)(1 - F(i/n))\int K_h^2(x - i/n)\,dx \sim \int_0^1 F(u)(1 - F(u))\,du\int K_h^2(x - u)\,dx \sim \|K\|^2\int_0^1 F(x)(1 - F(x))\,dx = \sigma_2^2,$$
$$\mathsf D(J_{n2}) \le \frac1{n^2}\sum_{i=1}^n \mathsf E(W_i - F(i/n))^4\Big(\int K_h^2(x - i/n)\,dx\Big)^2 = \frac1{n^2}\sum_{i=1}^n\big(F(i/n)^4(1 - F(i/n)) + F(i/n)(1 - F(i/n))^4\big)\Big(\int K_h^2(x - i/n)\,dx\Big)^2$$
$$\le \frac1{4n^2}\sum_{i=1}^n\Big(\int K_h^2(x - i/n)\,dx\Big)^2 \sim \frac{\|K\|^4}{4nh^2} \xrightarrow[n\to\infty]{} 0$$
by condition (A1), since $nh^2 \to \infty$. The statement of the lemma now follows from the Chebyshev inequality.

Now let us consider the behavior of $I_{n3}$:

(iii) $I_{n3} = n^{-1}h^{-1/2}J_{n3}$,
$$J_{n3} = \frac2{nh^{1/2}}\sum_{1\le i<j\le n}(W_i - F(i/n))(W_j - F(j/n))\int K_h(x - i/n)K_h(x - j/n)\,dx.$$

Lemma 3. Let conditions (A) be satisfied. Then, as $n \to \infty$, $h \to 0$, $n^2h \to \infty$, the sequence $J_{n3}$ is asymptotically normal with parameters $(0, \sigma_{n3}^2)$, where $\sigma_{n3}^2 = 2n^{-2}h^{-1}\sigma_3^2$,
$$\sigma_3^2 = \int\big(F(x)(1 - F(x))\big)^2\,dx\int\Big(\int K(u)K(u + v)\,du\Big)^2\,dv.$$

Proof. Define the variables
$$\xi_{ni} = \sum_{j=i+1}^n(W_i - F(i/n))(W_j - F(j/n))\int K_h(x - i/n)K_h(x - j/n)\,dx.$$
Then
$$J_{n3} = \frac2{nh^{1/2}}\sum_{i=1}^{n-1}\xi_{ni}.$$
Let $\mathcal F_k = \sigma(X_1, X_2, \dots, X_k)$ be the $\sigma$-algebra generated by the random variables $X_1, X_2, \dots, X_k$. Then $\{\xi_{nk}, \mathcal F_k\}_{1\le k\le n}$, $n \ge 1$, is a martingale difference (see [11]), since $\mathsf E|\xi_{nk}| < \infty$ and $\mathsf E(\xi_{nk}\mid\mathcal F_k) = 0$. To prove the asymptotic normality of $J_{n3}$, it suffices to show (see [11, Theorem 8 (II)]) that

(a) $\displaystyle \frac1{n^2h}\sum_{i=1}^{n-1}\mathsf E\big(\xi_{ni}^2\,I(|\xi_{ni}| > \delta nh^{1/2})\mid\mathcal F_{i-1}\big) \xrightarrow[n\to\infty]{\mathsf P} 0, \quad \delta \in (0, 1)$;

(b) $\displaystyle \frac1{n^2h}\sum_{i=1}^{n-1}\mathsf E(\xi_{ni}^2\mid\mathcal F_{i-1}) \xrightarrow[n\to\infty]{\mathsf P} \sigma_3^2$.

We have
$$\xi_{ni}^2 = (W_i - F(i/n))^2\Big(\sum_{j=i+1}^n(W_j - F(j/n))\int K_h(x - i/n)K_h(x - j/n)\,dx\Big)^2,$$
$$\mathsf E(\xi_{ni}^2\mid\mathcal F_{i-1}) = F(i/n)(1 - F(i/n))\,\mathsf D\Big(\sum_{j=i+1}^n(W_j - F(j/n))\int K_h(x - i/n)K_h(x - j/n)\,dx\Big)$$
$$= F(i/n)(1 - F(i/n))\sum_{j=i+1}^n F(j/n)(1 - F(j/n))\Big(\int K_h(x - i/n)K_h(x - j/n)\,dx\Big)^2.$$
Hence, as $n \to \infty$, $h \to 0$,
$$\frac1{n^2h}\sum_{i=1}^{n-1}\mathsf E(\xi_{ni}^2\mid\mathcal F_{i-1}) = \frac1{n^2h}\sum_{i=1}^{n-1}F(i/n)(1 - F(i/n))\sum_{j=i+1}^n F(j/n)(1 - F(j/n))\Big(\int_0^1 K_h(x - i/n)K_h(x - j/n)\,dx\Big)^2$$
$$\sim \frac1h\int_0^1 F(u)(1 - F(u))\Big[\int_u^1 F(v)(1 - F(v))\Big(\int K_h(u - x)K_h(v - x)\,dx\Big)^2 dv\Big]du \sim \int\big(F(x)(1 - F(x))\big)^2 dx\int\Big(\int K(u)K(u + v)\,du\Big)^2 dv = \sigma_3^2;$$
therefore condition (b) is satisfied. Moreover,
$$\frac1{n^2h}\sum_{i=1}^{n-1}\mathsf E\big(\xi_{ni}^2\,I(|\xi_{ni}| > \delta nh^{1/2})\mid\mathcal F_{i-1}\big) \le \frac1{\delta^2n^4h^2}\sum_{i=1}^{n-1}\mathsf E(\xi_{ni}^4\mid\mathcal F_{i-1}).$$
Consider the sum on the right-hand side of this inequality. We have
$$\mathsf E(\xi_{ni}^4\mid\mathcal F_{i-1}) = \mathsf E(W_i - F(i/n))^4\,\mathsf E\Big(\sum_{j=i+1}^n(W_j - F(j/n))\int K_h(x - i/n)K_h(x - j/n)\,dx\Big)^4$$
$$\le \mathsf E(W_i - F(i/n))^4\,\mathsf E\Big(\sum_{j=i+1}^n(W_j - F(j/n))\Big)^4 \le 16\,\mathsf E\Big(\sum_{j=i+1}^n(W_j - F(j/n))\Big)^4.$$
But
$$\mathsf E\Big(\sum_{j=i+1}^n(W_j - F(j/n))\Big)^4 = \sum_{j=i+1}^n\mathsf E(W_j - F(j/n))^4 + 6\sum_{i+1\le j<l\le n}\mathsf E\big((W_j - F(j/n))^2(W_l - F(l/n))^2\big)$$
$$= \sum_{j=i+1}^n\big(F(j/n)^4(1 - F(j/n)) + F(j/n)(1 - F(j/n))^4\big) + 6\sum_{i+1\le j<l\le n}F(j/n)(1 - F(j/n))F(l/n)(1 - F(l/n)) \le \frac n4 + 6\Big(\sum_{j=i+1}^n F(j/n)(1 - F(j/n))\Big)^2.$$
Hence,
$$\frac1{n^2h}\sum_{i=1}^{n-1}\mathsf E\big(\xi_{ni}^2\,I(|\xi_{ni}| > \delta nh^{1/2})\mid\mathcal F_{i-1}\big) \le \frac{16}{nh^2}\Big(\frac1{4n} + \frac6{n^2}\Big(\sum_{j=i+1}^n F(j/n)(1 - F(j/n))\Big)^2\Big) \sim \frac{16}{nh^2}\Big(\frac1{4n} + \frac38\Big) \xrightarrow[n\to\infty]{} 0.$$
So, in this case, condition (a) is satisfied. Now from [11] it follows that the sequence $J_{n3}$ is asymptotically normal with parameters $(0, \sigma_{n3}^2)$.

Remark 2. For the Epanechnikov kernel $K(x) = (3/4)(1 - x^2)I(|x| \le 1)$ the convolution equals $(K*K)(x) = K_2(x) = (3/160)(32 - 40x^2 + 20x^3 - x^5)$ for $0 \le x \le 2$ and $(K*K)(x) = K_2(-x)$ for $-2 \le x \le 0$. Therefore $\int\big(\int K(u)K(u + v)\,du\big)^2 dv = 167/385 \approx 0.434$.
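The constants in Remark 2 can be cross-checked numerically. The following sketch compares the closed-form convolution with direct quadrature and evaluates $\int\big(\int K(u)K(u+v)\,du\big)^2 dv$; SciPy's `quad` is assumed available and default tolerances are used.

```python
from scipy.integrate import quad

K = lambda t: 0.75 * (1 - t**2) * (abs(t) <= 1)          # Epanechnikov kernel

def K2(x):
    """Convolution (K*K)(x) = int K(u) K(u + x) du, computed by quadrature."""
    return quad(lambda u: K(u) * K(u + x), -1, 1)[0]

# closed form from Remark 2, valid for 0 <= x <= 2
poly = lambda x: 3.0 / 160.0 * (32 - 40 * x**2 + 20 * x**3 - x**5)

for x in (0.0, 0.5, 1.0, 1.5):
    print(f"x = {x}:  quadrature {K2(x):.6f}   closed form {poly(x):.6f}")

# int (K*K)(v)^2 dv over [-2, 2]; by symmetry twice the integral over [0, 2]
roughness = 2 * quad(lambda v: K2(v) ** 2, 0, 2)[0]
print("int (K*K)^2 dv =", roughness, "  167/385 =", 167 / 385)
```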
(iv) We have $I_n = 2I_{n1} + I_{n2} + I_{n3}$. Let $c(n) = \int\mathsf E(F_n(x) - F(x))^2\,dx$ and $\mu(n) = c(n) + n^{-1}\sigma_2^2$. From Lemmas 1–3 we derive the following theorem.

Theorem 4. Under conditions (A), assuming that $h \to 0$ and $nh \to \infty$ as $n \to \infty$, we have
$$n^{1/2}h^{-2}(I_n - \mu(n)) \xrightarrow[n\to\infty]{d} N(0,\ 4\nu^2\sigma_1^2) \quad\text{if } nh^5 \to \infty;$$
$$nh^{-1/2}(I_n - \mu(n)) \xrightarrow[n\to\infty]{d} N(0,\ 2\sigma_3^2) \quad\text{if } nh^5 \to 0;$$
$$nh^{-1/2}(I_n - \mu(n)) \xrightarrow[n\to\infty]{d} N(0,\ 4\nu^2\sigma_1^2\lambda^{4/5} + 2\sigma_3^2\lambda^{-1/5}) \quad\text{if } nh^5 \to \lambda \in (0, +\infty).$$

REFERENCES

1. S. V. Krishtopenko, Toxicometry of Effective Doses, Nizhny Novgorod University Press, Nizhny Novgorod (1997).
2. E. A. Nadaraya, "On estimating regression," Theory Probab. Appl., 9, No. 1, 157–159 (1964).
3. G. S. Watson, "Smooth regression analysis," Sankhyā, 26, 359–372 (1964).
4. M. S. Tikhov, "Statistical estimation on the basis of interval-censored data," J. Math. Sci., 119, No. 3, 321–335 (2004).
5. M. S. Tikhov and I. S. Dolgih, "Asymptotically unbiased estimates of a distribution function under dose-effect dependence," J. Math. Sci., 205, No. 1, 113–120 (2015).
6. H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, SIAM, Philadelphia, Pennsylvania (1992).
7. E. Hlawka, "Lösung von Integralgleichungen mittels zahlentheoretischer Methoden I," Sitzungsber. Abt. II, Österr. Akad. Wiss., Math.-Naturwiss. Kl., 171, No. 1, 103–123 (1962).
8. M. Loève, Probability Theory, Van Nostrand, Princeton (1960).
9. U. Einmahl and D. M. Mason, "Uniform in bandwidth consistency of kernel-type function estimators," Ann. Statist., 33, No. 3, 1380–1403 (2005).
10. B. V. Gnedenko, Course in Probability Theory, Editorial URSS, Moscow (2005).
11. R. S. Liptser and A. N. Shiryaev, Theory of Martingales, Springer, Amsterdam (1989).