ON THE ESTIMATION OF DISTRIBUTION FUNCTION ON INDIRECT SAMPLE

Elizbar Nadaraya¹, Petre Babilua², Grigol Sokhadze³
Iv. Javakhishvili Tbilisi State University, Tbilisi, Georgia
¹elizbar.nadaraya@tsu.ge, ²petre.babilua@tsu.ge, ³grigol.sokhadze@tsu.ge

1. Let $X_1, X_2, \ldots, X_n$ be a sample of independent observations of a non-negative random variable $X$ with distribution function $F(x)$. In problems of the theory of censored observations the sample values are pairs of random variables $Y_i = X_i \wedge t_i$ and $Z_i = I(Y_i = X_i)$, $i = 1, \ldots, n$, where the $t_i$ are given numbers ($t_i \ne t_j$ for $i \ne j$) or random variables independent of $X_i$, $i = 1, \ldots, n$. Throughout the paper $I(A)$ denotes the indicator of the set $A$.

We consider here a different setting: the observer has access only to the random variables

$$\xi_i = I(X_i < t_i), \qquad t_i = c_F\,\frac{2i-1}{2n}, \quad i = 1, \ldots, n, \qquad c_F = \inf\{x : F(x) = 1\}.$$

The problem consists in estimating the distribution function $F(x)$ from the sample $\xi_1, \xi_2, \ldots, \xi_n$. Such a problem arises, for example, in corrosion investigations (see [1], where an experiment connected with corrosion is described).

We will consider estimates for $F(x)$ that are analogous to regression curve estimates of Nadaraya-Watson type and have the form

$$\hat F_n(x) = \frac{F_{n1}(x)}{F_{n2}(x)}, \qquad F_{n1}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - t_i}{h}\right)\xi_i, \qquad F_{n2}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - t_i}{h}\right),$$

where $K(x)$ is some weight function (kernel) and $h = h_n$ is a sequence of positive numbers converging to zero.

Lemma. Assume that

$1^\circ$. $K(x)$ is some distribution density with bounded variation and $K(-x) = K(x)$, $x \in R = (-\infty, \infty)$.

If $nh \to \infty$, then

$$\frac{1}{nh}\sum_{j=1}^{n} K^{m_1}\!\left(\frac{x - t_j}{h}\right) F^{m_2}(t_j) = \frac{1}{c_F h}\int_0^{c_F} K^{m_1}\!\left(\frac{x - u}{h}\right) F^{m_2}(u)\,du + O\!\left(\frac{1}{nh}\right)$$

uniformly with respect to $x \in [0, c_F]$, where $m_1$, $m_2$ are natural numbers.

Without loss of generality we assume below that the interval $[0, c_F] = [0, 1]$.

Theorem 1. Let $F(x)$ be continuous and the conditions of the lemma be fulfilled. Then the estimate $\hat F_n(x)$ is asymptotically unbiased and consistent at all points $x \in [0, 1]$. Moreover, $\hat F_n(x)$ is asymptotically normally distributed, i.e.
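$$\sqrt{nh}\,\bigl(\hat F_n(x) - E\hat F_n(x)\bigr)\,\sigma^{-1}(x) \xrightarrow{d} N(0,1), \qquad \sigma^2(x) = F(x)\bigl(1 - F(x)\bigr)\int K^2(u)\,du,$$

where $\xrightarrow{d}$ denotes convergence in distribution and $N(0,1)$ is a random variable having a normal distribution with mean 0 and variance 1.

2. Uniform consistency. We define the conditions under which the estimate $\hat F_n(x)$ converges uniformly in probability (almost surely) to the true $F(x)$. Following E. Parzen [2], we introduce the Fourier transform of $K(x)$,

$$\varphi(t) = \int e^{itx} K(x)\,dx,$$

and assume that

$2^\circ$. $\varphi(t)$ is absolutely integrable.

Then $F_{n1}(x)$ can be written in the form

$$F_{n1}(x) = \frac{1}{2\pi}\int e^{-iu\frac{x}{h}}\,\varphi(u)\,\frac{1}{nh}\sum_{j=1}^{n} \xi_j\, e^{iu\frac{t_j}{h}}\,du.$$

Thus

$$F_{n1}(x) - EF_{n1}(x) = \frac{1}{2\pi}\int e^{-iu\frac{x}{h}}\,\varphi(u)\,\frac{1}{nh}\sum_{j=1}^{n} \bigl(\xi_j - F(t_j)\bigr)\, e^{iu\frac{t_j}{h}}\,du.$$

Denote

$$d_n = \sup_{x \in \Omega_n} \bigl|\hat F_n(x) - E\hat F_n(x)\bigr|, \qquad \Omega_n = [h^{\alpha}, 1 - h^{\alpha}], \quad 0 < \alpha < 1.$$

Theorem 2. Let $K(x)$ satisfy conditions $1^\circ$ and $2^\circ$.

(a) Let $F(x)$ be continuous and $n^{1/4} h_n \to \infty$; then

$$D_n = \sup_{x \in \Omega_n} \bigl|\hat F_n(x) - F(x)\bigr| \xrightarrow{p} 0;$$

(b) if $\sum_{n=1}^{\infty} n^{-p/4} h_n^{-p} < \infty$, $p > 4$, then $D_n \to 0$ almost surely.

Proof. We have

$$\sup_{x \in \Omega_n}\left|1 - \frac{1}{h}\int_0^1 K\!\left(\frac{x - u}{h}\right)du\right| \le \int_{-\infty}^{-h^{\alpha - 1}} K(u)\,du + \int_{h^{\alpha - 1}}^{\infty} K(u)\,du \to 0. \tag{1}$$

This and the lemma imply

$$\sup_{x \in \Omega_n} \bigl|F_{n2}(x) - 1\bigr| \to 0, \tag{2}$$

i.e., due to uniform convergence, for any $0 < \varepsilon_0 < 1$ and sufficiently large $n \ge n_0$ we have $F_{n2}(x) \ge 1 - \varepsilon_0$ uniformly with respect to $x \in \Omega_n$.

Denote

$$A_n = \sup_{u \in R}\left|\frac{1}{nh}\sum_{j=1}^{n} \eta_j\, e^{iu\frac{t_j}{h}}\right|, \qquad \eta_j = \xi_j - F(t_j).$$

Since $F_{n2}(x)$ is non-random, $\hat F_n(x) - E\hat F_n(x) = \bigl(F_{n1}(x) - EF_{n1}(x)\bigr)/F_{n2}(x)$, and therefore

$$d_n^{\,p} \le \frac{1}{(1 - \varepsilon_0)^p (2\pi)^p}\, A_n^{\,p} \left(\int |\varphi(u)|\,du\right)^{p}, \qquad p > 4. \tag{3}$$

Note that

$$EA_n^{\,p} = \frac{1}{(nh)^p}\, E \sup_{u \in R}\left|\sum_{j=1}^{n} \eta_j^2 + \sum_{j=1}^{n}\sum_{k \ne j} \cos\!\left(\frac{t_k - t_j}{h}\,u\right)\eta_k \eta_j\right|^{p/2}$$

$$= \frac{1}{(nh)^p}\, E \sup_{u \in R}\left|\sum_{j=1}^{n} \eta_j^2 + \sum_{j=1}^{n}\sum_{k \ne j} \cos\!\left(\frac{k - j}{nh}\,u\right)\eta_k \eta_j\right|^{p/2}$$

$$= \frac{1}{(nh)^p}\, E \sup_{u \in R}\left|\sum_{j=1}^{n} \eta_j^2 + \sum_{\substack{m = -n+1 \\ m \ne 0}}^{n-1} \cos\!\left(\frac{m}{nh}\,u\right)\sum_{j=1}^{n - |m|} \eta_j \eta_{j + |m|}\right|^{p/2}.$$

From this, by the inequality

$$\left|\sum_{j=1}^{n} a_j\right|^{q} \le n^{q-1}\sum_{j=1}^{n} |a_j|^{q}, \qquad q \ge 1,$$

we have

$$EA_n^{\,p} \le \frac{2^{\frac{p}{2} - 1}}{(nh)^p}\, E\left|\sum_{i=1}^{n} \eta_i^2\right|^{p/2} + \frac{2^{\frac{p}{2} - 1}}{(nh)^p}\, E \sup_{u \in R}\left|\sum_{\substack{m = -n+1 \\ m \ne 0}}^{n-1} \cos\!\left(\frac{m}{nh}\,u\right)\sum_{j=1}^{n - |m|} \eta_j \eta_{j + |m|}\right|^{p/2} = C_{n1} + C_{n2}. \tag{4}$$

Let us estimate $C_{n1}$ and $C_{n2}$:

$$C_{n1} \le \frac{2^{\frac{p}{2} - 1}\, n^{\frac{p}{2} - 1}}{(nh)^p}\sum_{j=1}^{n} E|\eta_j|^p = \frac{2^{\frac{p}{2} - 1}}{n^{\frac{p}{2} + 1} h^p}\sum_{j=1}^{n}\Bigl[\bigl(1 - F(t_j)\bigr)^p F(t_j) + F^p(t_j)\bigl(1 - F(t_j)\bigr)\Bigr] \le \frac{c_2}{n^{p/2} h^p}. \tag{5}$$

Further, since the sum over $m$ contains $2n - 2$ terms and $|\cos| \le 1$, we obtain

$$C_{n2} \le \frac{2^{\frac{p}{2} - 1}\,(2n - 2)^{\frac{p}{2} - 1}}{(nh)^p}\sum_{\substack{m = -n+1 \\ m \ne 0}}^{n-1} E\left|\sum_{j=1}^{n - |m|} \eta_j \eta_{j + |m|}\right|^{p/2},$$

and, using Whittle's inequality [3] for moments of quadratic forms,

$$E\left|\sum_{j=1}^{n - |m|} \eta_j \eta_{j + |m|}\right|^{p/2} \le c_p\,(n - |m|)^{p/4},$$

where $c_p$ depends only on $p$ and $E|\eta_j|^p \le 1$.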
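Thus

$$\sum_{\substack{m = -n+1 \\ m \ne 0}}^{n-1} E\left|\sum_{j=1}^{n - |m|} \eta_j \eta_{j + |m|}\right|^{p/2} \le 2c_p\sum_{m=1}^{n-1} m^{p/4} = O\bigl(n^{\frac{p}{4} + 1}\bigr)$$

and

$$C_{n2} = O\!\left(\frac{1}{n^{p/4} h^p}\right). \tag{6}$$

After combining the relations (3), (4), (5) and (6), we obtain

$$Ed_n^{\,p} = O\!\left(\frac{1}{n^{p/4} h^p}\right), \qquad p > 4.$$

Therefore, for any $\varepsilon > 0$,

$$P\left\{\sup_{x \in \Omega_n}\bigl|\hat F_n(x) - E\hat F_n(x)\bigr| \ge \varepsilon\right\} \le \frac{c_3}{\varepsilon^p\, n^{p/4} h^p}. \tag{7}$$

Further we obtain

$$\sup_{x \in \Omega_n}\bigl|E\hat F_n(x) - F(x)\bigr| \le \frac{1}{1 - \varepsilon_0}\left[\sup_{x \in \Omega_n}\bigl|EF_{n1}(x) - F(x)\bigr| + \sup_{x \in \Omega_n}\bigl|1 - F_{n2}(x)\bigr|\right]. \tag{8}$$

The second summand on the right-hand side of (8) tends to zero by virtue of (2), while the first summand is estimated as follows:

$$\sup_{x \in \Omega_n}\bigl|EF_{n1}(x) - F(x)\bigr| \le S_{n1} + S_{n2} + O\!\left(\frac{1}{nh}\right), \tag{9}$$

where

$$S_{n1} = \sup_{0 \le x \le 1}\left|\frac{1}{h}\int_0^1 \bigl(F(y) - F(x)\bigr) K\!\left(\frac{x - y}{h}\right)dy\right|, \qquad S_{n2} = \sup_{x \in \Omega_n}\left|1 - \frac{1}{h}\int_0^1 K\!\left(\frac{x - y}{h}\right)dy\right|,$$

and, by virtue of (1),

$$S_{n2} \to 0. \tag{10}$$

Now let us consider $S_{n1}$. Note that

$$S_{n1} = \sup_{0 \le x \le 1}\left|\frac{1}{h}\int_0^1 \bigl(F(y) - F(x)\bigr) K\!\left(\frac{x - y}{h}\right)dy\right| = \sup_{0 \le x \le 1}\left|\int_{x-1}^{x} \bigl(F(x - u) - F(x)\bigr)\frac{1}{h} K\!\left(\frac{u}{h}\right)du\right| \le \sup_{0 \le x \le 1}\int_{x-1}^{x} \bigl|F(x - u) - F(x)\bigr|\,\frac{1}{h} K\!\left(\frac{u}{h}\right)du. \tag{11}$$

Take $\delta > 0$ and divide the integration domain in (11) into the two domains $|u| \le \delta$ and $|u| > \delta$. Then

$$S_{n1} \le \sup_{0 \le x \le 1}\int_{|u| \le \delta} \bigl|F(x - u) - F(x)\bigr|\,\frac{1}{h} K\!\left(\frac{u}{h}\right)du + \sup_{0 \le x \le 1}\int_{|u| > \delta} \bigl|F(x - u) - F(x)\bigr|\,\frac{1}{h} K\!\left(\frac{u}{h}\right)du \le \sup_{x \in R}\,\sup_{|u| \le \delta}\bigl|F(x - u) - F(x)\bigr| + 2\int_{|u| \ge \delta/h} K(u)\,du. \tag{12}$$

Since $F$ is continuous, and hence uniformly continuous on $R$, the first summand on the right-hand side of (12) can be made arbitrarily small by the choice of $\delta > 0$. Having fixed $\delta > 0$ and letting $n$ tend to infinity, we obtain that the second summand tends to zero. Thus

$$\lim_{n \to \infty} S_{n1} = 0. \tag{13}$$

Finally, the proof of the theorem follows from the relations (7)-(10) and (13).

Remarks. 1) If $K(x) = 0$ for $|x| \ge 1$ and $\alpha = 1$, i.e., $\Omega_n = [h, 1 - h]$, then $S_{n2} = 0$.

2) Under the conditions of Theorem 2,

$$\sup_{x \in [a, b]}\bigl|\hat F_n(x) - F(x)\bigr| \to 0$$

in probability (almost surely) for any fixed interval $[a, b] \subset (0, 1)$, since there exists $n_0$ such that $[a, b] \subset \Omega_n$ for $n \ge n_0$.

Let us assume that $h = n^{-\gamma}$, $\gamma > 0$. The conditions of Theorem 2 are fulfilled: $n^{1/4} h_n \to \infty$ if $\gamma < \frac{1}{4}$, and $\sum_{n=1}^{\infty} n^{-p/4} h_n^{-p} < \infty$ if $\gamma < \frac{1}{4} - \frac{1}{p}$, $p > 4$.

3. Estimation of moments. In the considered problem there naturally arises the question of estimating integral functionals of $F(x)$, for example, the moments $\mu_m$, $m \ge 1$:

$$\mu_m = m\int_0^1 t^{m-1}\bigl(1 - F(t)\bigr)\,dt.$$

As estimates for $\mu_m$ we will consider the statistics

$$\hat\mu_{nm} = 1 - \frac{m}{n}\sum_{j=1}^{n} \xi_j\,\frac{1}{h}\int_{h}^{1-h} t^{m-1} K\!\left(\frac{t - t_j}{h}\right) F_{n2}^{-1}(t)\,dt.$$

Theorem 3.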
Let $K(x)$ satisfy condition $1^\circ$ and, in addition, let $K(x) = 0$ outside the interval $[-1, 1]$. If $nh \to \infty$ as $n \to \infty$, then $\hat\mu_{nm}$ is an asymptotically unbiased, consistent estimate for $\mu_m$ and, moreover,

$$\sqrt{n}\,\bigl(\hat\mu_{nm} - E\hat\mu_{nm}\bigr)\,\sigma^{-1} \xrightarrow{d} N(0,1), \qquad \sigma^2 = m^2\int_0^1 t^{2m-2} F(t)\bigl(1 - F(t)\bigr)\,dt.$$

References

1. Manjgaladze, K.V. On one estimate of a distribution function and its moments. (Russian) Bull. Acad. Sci. Georgian SSR 124 (1980), No. 2, pp. 261-268.
2. Parzen, E. On estimation of a probability density function and mode. Ann. Math. Statist. 33 (1962), pp. 1065-1076.
3. Whittle, P. Bounds for the moments of linear and quadratic forms in independent variables. Teor. Verojatnost. i Primenen. 5 (1960), pp. 331-335.
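
Numerical illustration. The ratio estimate $\hat F_n(x) = F_{n1}(x)/F_{n2}(x)$ of Section 1 can be illustrated on simulated data. What follows is only a minimal sketch under assumptions that are not part of the paper: the Epanechnikov kernel is taken as one possible $K$ satisfying condition $1^\circ$ with support $[-1, 1]$, the observations are drawn from a Beta(2, 2) law (so that $c_F = 1$), the bandwidth is $h = n^{-0.2}$, and the function names `epanechnikov`, `design_points` and `f_hat` are hypothetical.

```python
import numpy as np


def epanechnikov(u):
    """Symmetric density supported on [-1, 1]; an illustrative choice of K."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)


def design_points(n):
    """Points t_i = (2i - 1) / (2n), i = 1..n, on [0, 1] (i.e. c_F = 1)."""
    return (2.0 * np.arange(1, n + 1) - 1.0) / (2.0 * n)


def f_hat(x, xi, h, kernel=epanechnikov):
    """Ratio estimate F_n1(x) / F_n2(x) built from xi_i = I(X_i < t_i)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    t = design_points(len(xi))
    w = kernel((x[:, None] - t[None, :]) / h)  # weights, shape (len(x), n)
    num = w @ xi          # equals n*h*F_n1(x)
    den = w.sum(axis=1)   # equals n*h*F_n2(x); the factor n*h cancels
    return num / den


# Simulated indirect sample (assumed setup: X ~ Beta(2, 2), so F(x) = 3x^2 - 2x^3).
rng = np.random.default_rng(0)
n = 10_000
h = n ** -0.2                       # h = n^(-gamma) with gamma = 0.2 < 1/4
X = rng.beta(2.0, 2.0, size=n)      # the X_i themselves are never used directly
xi = (X < design_points(n)).astype(float)

grid = np.linspace(h, 1.0 - h, 50)  # points of Omega_n with alpha = 1
est = f_hat(grid, xi, h)
true = 3.0 * grid ** 2 - 2.0 * grid ** 3
max_err = float(np.max(np.abs(est - true)))
print(f"sup-norm error on [h, 1 - h]: {max_err:.4f}")
```

Because the weights are non-negative and the $\xi_i$ take values in $\{0, 1\}$, the computed values always lie in $[0, 1]$. The grid is restricted to $[h, 1 - h]$, which corresponds to the set $\Omega_n$ of Remark 1 for a kernel vanishing outside $[-1, 1]$.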