Hindawi Publishing Corporation
Journal of Probability and Statistics
Volume 2012, Article ID 131085, 20 pages
doi:10.1155/2012/131085
Research Article
Least Absolute Deviation Estimate for Functional
Coefficient Partially Linear Regression Models
Yanqin Feng,¹ Guoxin Zuo,² and Li Liu¹
¹ School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
² School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
Correspondence should be addressed to Yanqin Feng, yanqf2008@yahoo.com.cn
Received 5 August 2012; Revised 10 October 2012; Accepted 1 November 2012
Academic Editor: Artur Lemonte
Copyright © 2012 Yanqin Feng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The functional coefficient partially linear regression model is a useful generalization of the nonparametric model, the partially linear model, and the varying coefficient model. In this paper, the local linear technique and the L1 method are employed to estimate all the functions in the functional coefficient partially linear regression model. The asymptotic properties of the proposed estimators are studied. Simulation studies are conducted to show the validity of the estimation procedure.
1. Introduction
In this paper, we are concerned with the functional coefficient partially linear regression (FCPLR) model
\[
Y = a_0(X) + \sum_{j=1}^{p} a_j(U) Z_j + \varepsilon, \tag{1.1}
\]
where X and U are random explanatory variables, Z = (Z_1, …, Z_p)' is a random vector, and a_j(·) is a measurable function from R to R for j = 0, …, p. We call a_0(·) the intercept function and {a_j(·)}, j = 1, …, p, the coefficient functions. As usual, ε denotes the error, with zero mean and fixed variance.
The FCPLR model, first introduced by Wong et al. [1], is a generalization of the nonparametric model, the partially linear model, and the varying coefficient model. Zhu et al. [2] studied a related functional coefficient model, the functional mixed model, using a new Bayesian method. Model (1.1) reduces to a varying coefficient regression model if the intercept function a_0(·) is constant, and to a partially linear regression model when the coefficient functions {a_j(·)}_{j=1}^p are constant. Many researchers, for example, Aneiros-Pérez and Vieu [3], have contributed to the study of this kind of model. When X and U in model (1.1) coincide, the model becomes the semiparametric varying coefficient model discussed by Ahmad et al. [4]. Since the FCPLR model combines the nonparametric and functional coefficient regression models, its flexibility makes it attractive in various regression problems [1].
Statistical inference for the FCPLR model mainly concerns estimation of the intercept function a_0(·) and the coefficient functions {a_j(·)}_{j=1}^p. To estimate the unknown functions in nonparametric/semiparametric regression models, many statistical inference methods have been proposed over the past decades, such as the kernel estimation method [5–7], spline smoothing [8, 9], and the two-step estimation method [10, 11]. Wong et al. [1] employed the local linear regression method and the integrated method to give initial estimates of all functions in the FCPLR model. All the papers mentioned above used the least-squares (L2) technique to obtain estimators of the unknown coefficient functions. Least-squares estimators, of course, have some good properties, especially when the random errors are normal. It is well known, however, that the least-squares method performs poorly when the random errors have a heavy-tailed distribution, since it is highly sensitive to extreme values and outliers. This motivates us to seek more robust estimation methods for model (1.1).
Local linear approximation is a good method for nonparametric regression problems [12], and the L1 method, based on least absolute deviations, overcomes the sensitivity to outliers. As noted in Wang and Scott [13] and Fan and Gijbels [12], among many robust estimation methods the L1 method based on local least absolute deviations behaves quite well. In this paper, we adopt the L1 method, together with the local linear technique and the integrated method, to estimate all the unknown functions in model (1.1). Furthermore, the estimation problem can be reduced to a linear programming problem, so numerical solutions are obtained quickly by available software (e.g., Matlab is very useful for this kind of problem). The main difficulty in proving the asymptotic normality results is that the L1 estimates have no closed form. This paper establishes the asymptotic normality of the L1 estimators through a method completely different from those based on the L2 method, and the simulation results show that the L1 method is indeed robust.
The rest of this paper is organized as follows. In Section 2, we describe the estimation method and the associated bandwidth selection procedure. Section 3 gives the asymptotic theory of the estimators. Simulation studies are conducted in Section 4. A real application is given in Section 5. Section 6 gives the proofs of the main results.
2. Least Absolute Deviation Estimate
This section gives the main idea of the proposed estimation method: local linear polynomials are used to approximate the nonparametric function and the functional coefficients, and the least absolute deviation technique is used to find the best approximation. Bandwidth selection is also discussed in this section. Throughout this paper, we suppose that {(Y_i, X_i, U_i, Z_i)}_{i=1}^n is an i.i.d. sample from model (1.1) and assume the following conditions.
Assumptions

(1) Ω = E(ZZ' | X = x_0, U = u_0) is a positive definite matrix, and E(Z | X = x_0, U = u_0) = ω.
(2) The bandwidth h satisfies h ∼ n^{−1/6}.
(3) The random error ε, with zero mean and zero median, is independent of Z conditional on (X, U). The conditional probability density g(·|x, u) of ε given the random variables X and U is continuous in a neighborhood of the point 0, and g(0|x, u) > 0. ε_1, …, ε_n are independent and identically distributed.
(4) The density functions f(x, u), f_1(x), f_2(u) of (X, U), X, and U are continuous in neighborhoods of (x_0, u_0), x_0, and u_0, respectively, and f(x_0, u_0) > 0.
(5) All functions a_j(u), j = 1, …, p, and a_0(x) are twice continuously differentiable in neighborhoods of u_0 and x_0, respectively.
(6) The kernel functions K_k(·), k = 1, 2, are bounded, nonnegative, and compactly supported.
(7) max_{1≤i≤n} ‖Z_i‖ = o_p(n^{1/3}).

To simplify notation, we introduce the symbols
\[
\mu_i(k) = \int v^i K_k(v)\,dv, \qquad \gamma_i(k) = \int v^i K_k^2(v)\,dv \tag{2.1}
\]
for k = 1, 2 and i = 1, 2, ….
2.1. Local Linear Estimate Based on Least Absolute Deviation

The main idea is to approximate the functional coefficients a_j(·) by linear functions for j = 0, …, p; that is, a_0(x) can be approximated by
\[
a_0(x) \approx a_0(x_0) + a_0'(x_0)(x - x_0) \tag{2.2}
\]
for x in a neighborhood of x_0 within the closed support of X, and a_j(u) by
\[
a_j(u) \approx a_j(u_0) + a_j'(u_0)(u - u_0), \qquad j = 1, \ldots, p, \tag{2.3}
\]
for u in a neighborhood of u_0 within the closed support of U. For simplicity, denote (a_0(x_0), a_0'(x_0)) by (a_0, b_0) and (a_j(u_0), a_j'(u_0)) by (a_j, b_j) for j = 1, …, p. The local linear least absolute deviation estimate (L1 estimate) of the unknown parameters a_0, b_0, a := (a_1, …, a_p)' and b := (b_1, …, b_p)', denoted respectively by â_0, b̂_0, â, b̂, is the optimal solution of the minimization problem
\[
\min \sum_{i=1}^{n} \bigl| Y_i - a_0 - b_0 (X_i - x_0) - a' Z_i - b' (U_i - u_0) Z_i \bigr|\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0), \tag{2.4}
\]
where K_{k,h}(·) = K_k(·/h), k = 1, 2, are the given kernel functions and h is the chosen bandwidth. The optimization problem is equivalent to the following linear programming problem:
\[
\begin{aligned}
\min \quad & \sum_{i=1}^{n} \bigl( e_i^{+} + e_i^{-} \bigr) K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0) \\
\text{s.t.} \quad & a_0 + b_0 (X_i - x_0) + a' Z_i + b'(U_i - u_0) Z_i + e_i^{+} - e_i^{-} = Y_i, \\
& e_i^{+} \ge 0, \quad e_i^{-} \ge 0, \quad i = 1, \ldots, n.
\end{aligned} \tag{2.5}
\]
There are many algorithms available for solving problem (2.5); for example, the feasible direction method can be used directly to compute the optimal solution [14], and the numerical solution of (2.5) can be quickly computed by a series of Matlab functions.
By the integrated method [1], the estimator of the intercept function a_0(x) is defined by
\[
\hat a_0(x_0) = \frac{1}{n} \sum_{k=1}^{n} \hat a_0(x_0, U_k), \tag{2.6}
\]
where â_0(x_0, U_k) denotes the local estimate â_0 obtained from (2.4) at the point (x_0, U_k), and the estimators of the coefficient functions a_j(u_0), j = 1, …, p, are defined by
\[
\hat a_j(u_0) = \frac{1}{n} \sum_{k=1}^{n} \hat a_j(X_k, u_0), \qquad j = 1, \ldots, p. \tag{2.7}
\]
Our main task is to establish the asymptotic distributions of the estimators â_0(x_0) and â_j(u_0) for j = 1, …, p.
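The local LAD problem above can be solved by any linear programming routine. The following is a minimal sketch, assuming SciPy's `linprog` is available, with the Epanechnikov kernel used for both K_1 and K_2 and a common bandwidth h; the function names and interface are illustrative, not from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def epanechnikov(v):
    """Epanechnikov kernel K(v) = 0.75*(1 - v^2) on |v| < 1."""
    return 0.75 * (1.0 - v**2) * (np.abs(v) < 1)

def lad_local_linear(Y, X, U, Z, x0, u0, h):
    """Solve the weighted LAD problem (2.5) by linear programming.

    Returns (a0, b0, a, b): local estimates of a0(x0), a0'(x0),
    (a1(u0), ..., ap(u0)) and their derivatives.
    """
    n, p = Z.shape
    w = epanechnikov((X - x0) / h) * epanechnikov((U - u0) / h) / h**2
    keep = w > 0                      # points outside the kernel support drop out
    Y, X, U, Z, w = Y[keep], X[keep], U[keep], Z[keep], w[keep]
    m = len(Y)
    # design columns: [1, X - x0, Z, (U - u0)*Z] -> 2 + 2p coefficients
    D = np.column_stack([np.ones(m), X - x0, Z, (U - u0)[:, None] * Z])
    q = D.shape[1]
    # LP variables: theta (free), e_plus >= 0, e_minus >= 0
    c = np.concatenate([np.zeros(q), w, w])
    A_eq = np.hstack([D, np.eye(m), -np.eye(m)])
    bounds = [(None, None)] * q + [(0, None)] * (2 * m)
    res = linprog(c, A_eq=A_eq, b_eq=Y, bounds=bounds, method="highs")
    theta = res.x[:q]
    return theta[0], theta[1], theta[2:2 + p], theta[2 + p:q]
```

When the data are exactly linear in the design, the LP attains objective zero and recovers the coefficients exactly, which makes a convenient sanity check.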
2.2. Selection of Bandwidth

It is well known that the choice of bandwidth strongly influences the adequacy of the estimators. We use an automatic bandwidth selection procedure, the absolute cross-validation (ACV) method; the ACV bandwidth h_ACV is defined as
\[
h_{\mathrm{ACV}} = \arg\min_h \mathrm{ACV}(h) = \arg\min_h \frac{1}{n} \sum_{k=1}^{n} \Bigl| Y_k - \hat a_0^{(-k)}(X_k) - \sum_{j=1}^{p} \hat a_j^{(-k)}(U_k) Z_{kj} \Bigr|, \tag{2.8}
\]
where â_0^{(−k)}(X_k) and â_j^{(−k)}(U_k), j = 1, …, p, are constructed from the n − 1 observations obtained by leaving out the k-th observation (Y_k, X_k, U_k, Z_k). According to Wang and Scott [13], this bandwidth is better than the cross-validation (CV) bandwidth h_CV, which was suggested by Rice and Silverman [15] and is often used in curve regression, as in Hoover et al. [6] and Wu et al. [7].
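The leave-one-out ACV criterion can be sketched generically as follows. Here `fit_predict` is a user-supplied routine (an assumption of this sketch, not an interface from the paper) that fits the model on the training data and predicts Y at the held-out covariates, e.g. the local linear LAD fit of Section 2.1.

```python
import numpy as np

def acv_score(h, data, fit_predict):
    """Absolute cross-validation criterion ACV(h) of (2.8):
    mean absolute leave-one-out prediction error at bandwidth h."""
    Y, X, U, Z = data
    n = len(Y)
    errs = np.empty(n)
    for k in range(n):
        idx = np.arange(n) != k          # leave out the k-th observation
        train = (Y[idx], X[idx], U[idx], Z[idx])
        errs[k] = abs(Y[k] - fit_predict(train, X[k], U[k], Z[k], h))
    return errs.mean()

def select_bandwidth(grid, data, fit_predict):
    """h_ACV = argmin of ACV(h) over a grid of candidate bandwidths."""
    scores = [acv_score(h, data, fit_predict) for h in grid]
    return grid[int(np.argmin(scores))]
```

In practice the grid is a short list of bandwidths of the order n^{−1/6}, and the full double loop is the main computational cost of the procedure.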
3. Asymptotic Theory

This section gives the asymptotic distribution theory of the estimators. By Taylor's expansion, for |X_i − x_0| ≤ Ch and |U_i − u_0| ≤ Ch, we have
\[
\begin{aligned}
a_0(X_i) &= a_0(x_0) + a_0'(x_0)(X_i - x_0) + \tfrac12 a_0''(\xi_i)(X_i - x_0)^2, \\
a_j(U_i) &= a_j(u_0) + a_j'(u_0)(U_i - u_0) + \tfrac12 a_j''(\eta_{ij})(U_i - u_0)^2, \qquad j = 1, \ldots, p,
\end{aligned} \tag{3.1}
\]
where ξ_i lies between x_0 and X_i, and η_{ij} lies between u_0 and U_i. Let a(u_0) = (a_1(u_0), …, a_p(u_0))', b(u_0) = (a_1'(u_0), …, a_p'(u_0))', and c(u_0) = (a_1''(u_0), …, a_p''(u_0))'. Substituting model (1.1) and the expansions (3.1), and subtracting a term that does not depend on the parameters, we have
\[
\begin{aligned}
(\hat a_0, \hat b_0, \hat a{}', \hat b{}')
={}& \arg\min \sum_{i=1}^{n} \bigl| Y_i - a_0 - b_0(X_i - x_0) - a'Z_i - b'(U_i - u_0)Z_i \bigr|\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0) \\
={}& \arg\min \sum_{i=1}^{n} \Biggl\{ \biggl| \frac{1}{\sqrt{nh^2}} \Bigl[ \sqrt{nh^2}\bigl(a_0 - a_0(x_0)\bigr) + \sqrt{nh^2}\bigl(b_0 - a_0'(x_0)\bigr)(X_i - x_0) \\
&\qquad + \sqrt{nh^2}\bigl(a - a(u_0)\bigr)'Z_i + \sqrt{nh^2}\bigl(b - b(u_0)\bigr)'(U_i - u_0)Z_i \Bigr] \\
&\qquad - \Bigl[ \tfrac12 a_0''(\xi_i)(X_i - x_0)^2 + \tfrac12 \sum_{j=1}^{p} a_j''(\eta_{ij})(U_i - u_0)^2 Z_{ij} + \varepsilon_i \Bigr] \biggr| \\
&\qquad - \Bigl| \tfrac12 a_0''(\xi_i)(X_i - x_0)^2 + \tfrac12 \sum_{j=1}^{p} a_j''(\eta_{ij})(U_i - u_0)^2 Z_{ij} + \varepsilon_i \Bigr| \Biggr\}\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0).
\end{aligned} \tag{3.2}
\]
The aim of this paper is to study the asymptotic behavior of √(nh²)(â_0 − a_0(x_0)) and √(nh²)(â − a(u_0)). For technical reasons, we first introduce the new variables α_0 = √(nh²)(a_0 − a_0(x_0)), α = √(nh²)(a − a(u_0)), β_0 = √(nh²)(b_0 − a_0'(x_0))h, and β = √(nh²)(b − b(u_0))h [16], and form the equivalent problem
\[
\begin{aligned}
(\hat\alpha_0, \hat\beta_0, \hat\alpha{}', \hat\beta{}')
={}& \arg\min \sum_{i=1}^{n} \Biggl\{ \biggl| \frac{1}{\sqrt{nh^2}} \Bigl[ \alpha_0 + \beta_0 \frac{X_i - x_0}{h} + \alpha' Z_i + \beta' \frac{U_i - u_0}{h} Z_i \Bigr] \\
&\qquad - \Bigl[ \tfrac12 a_0''(\xi_i)(X_i - x_0)^2 + \tfrac12 \sum_{j=1}^{p} a_j''(\eta_{ij})(U_i - u_0)^2 Z_{ij} + \varepsilon_i \Bigr] \biggr| \\
&\qquad - \Bigl| \tfrac12 a_0''(\xi_i)(X_i - x_0)^2 + \tfrac12 \sum_{j=1}^{p} a_j''(\eta_{ij})(U_i - u_0)^2 Z_{ij} + \varepsilon_i \Bigr| \Biggr\}\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0).
\end{aligned} \tag{3.3}
\]
Let
\[
\begin{aligned}
F_n &= E\,S_n, \\
R_n &= S_n - F_n + \sum_{i=1}^{n} \frac{1}{\sqrt{nh^2}} \Bigl[ \alpha_0 + \beta_0 \frac{X_i - x_0}{h} + \alpha' Z_i + \beta' \frac{U_i - u_0}{h} Z_i \Bigr] K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i) \\
&=: S_n - F_n + \sum_{i=1}^{n} L_{ni},
\end{aligned} \tag{3.4}
\]
where S_n is the objective function in (3.3) and sgn(·) is the sign function. Since the L1 estimators have no closed form, we first give the limit of the function F_n, which is critical for obtaining the asymptotic properties of the estimators.
Theorem 3.1. Suppose Assumptions (1)–(7) hold. Then, as n → ∞, for any fixed (α_0, β_0, α, β), F_n converges to F(α_0, β_0, α, β), defined as
\[
\begin{aligned}
F(\alpha_0,\beta_0,\alpha,\beta) = g(0 \mid x_0, u_0)\, f(x_0, u_0) \Bigl[ & \alpha_0^2 + 2\alpha_0\beta_0\,\mu_1(1) + \beta_0^2\,\mu_2(1) \\
&+ \alpha'\Omega\alpha + 2\mu_1(2)\,\alpha'\Omega\beta + \mu_2(2)\,\beta'\Omega\beta + 2\alpha_0\bigl(\alpha'\omega + \mu_1(2)\,\beta'\omega\bigr) \\
&+ 2\beta_0\,\mu_1(1)\bigl(\alpha'\omega + \mu_1(2)\,\beta'\omega\bigr) \\
&- a_0''(x_0)\bigl(\alpha_0\,\mu_2(1) + \beta_0\,\mu_3(1) + \mu_2(1)\,\alpha'\omega + \mu_2(1)\mu_1(2)\,\beta'\omega\bigr) \\
&- c(u_0)'\bigl(\alpha_0\,\mu_2(2)\,\omega + \beta_0\,\mu_1(1)\mu_2(2)\,\omega + \mu_2(2)\,\Omega\alpha + \mu_3(2)\,\Omega\beta\bigr) \Bigr].
\end{aligned} \tag{3.5}
\]

Remark 3.2. If the kernel functions K_1(·), K_2(·) are symmetric about zero and Lipschitz continuous, then μ_1(k) = μ_3(k) = 0 and the limit F(α_0, β_0, α, β) simplifies to
\[
F(\alpha_0,\beta_0,\alpha,\beta) = g(0 \mid x_0,u_0)\, f(x_0,u_0) \Bigl[ \alpha_0^2 + \mu_2(1)\,\beta_0^2 + \alpha'\Omega\alpha + \mu_2(2)\,\beta'\Omega\beta + 2\alpha_0\,\alpha'\omega - \mu_2(1)\, a_0''(x_0)\bigl(\alpha_0 + \alpha'\omega\bigr) - \mu_2(2)\, c(u_0)'\bigl(\alpha_0\,\omega + \Omega\alpha\bigr) \Bigr]. \tag{3.6}
\]

We are now in a position to state the asymptotic properties of the estimators.
Theorem 3.3. Suppose Assumptions (1)–(7) hold. Then, as n → ∞,
\[
\sqrt{nh}\,\Biggl( \hat a_0(x_0) - a_0(x_0) - \frac{h^2 \bigl( \mu_2^2(1) - \mu_3(1)\mu_1(1) \bigr) a_0''(x_0)}{2 \bigl( \mu_2(1) - \mu_1^2(1) \bigr)} \Biggr) \xrightarrow{d} N\bigl(0, \sigma_{\alpha_0}^2\bigr), \tag{3.7}
\]
\[
\sqrt{nh}\,\Biggl( \hat a(u_0) - a(u_0) - \frac{h^2 \bigl( \mu_2^2(2) - \mu_1(2)\mu_3(2) \bigr) c(u_0)}{2 \bigl( \mu_2(2) - \mu_1^2(2) \bigr)} \Biggr) \xrightarrow{d} N_p\bigl(0, \Sigma_\alpha\bigr), \tag{3.8}
\]
where
\[
\begin{aligned}
\sigma_{\alpha_0}^2 ={}& \frac{\bigl( 1 - \omega'\Omega^{-1}\omega \bigr)\gamma_0(1)\, r_2^2(x_0)}{4}
+ \frac{\mu_1^2(1)\bigl( \mu_1^2(1)\gamma_0(1) - 2\mu_1(1)\gamma_1(1) + \gamma_2(1) \bigr) r_2^2(x_0)}{4\, g^2(0 \mid x_0,u_0)\, f^2(x_0,u_0) \bigl( \mu_1^2(1) - \mu_2(1) \bigr)^2} \\
&- \frac{\bigl( 1 - \omega'\Omega^{-1}\omega \bigr)\mu_1(1)\bigl( \mu_1(1)\gamma_0(1) - \gamma_1(1) \bigr) r_2^2(x_0)}{2\, g(0 \mid x_0,u_0)\, f(x_0,u_0) \bigl( \mu_1^2(1) - \mu_2(1) \bigr)}, \\
\Sigma_\alpha ={}& \frac{\bigl( \Omega - \omega\omega' \bigr)^{-1}\gamma_0(2)\, r_1^2(u_0)}{4}
+ \frac{\mu_1^2(2)\bigl( \mu_1^2(2)\gamma_0(2) - 2\mu_1(2)\gamma_1(2) + \gamma_2(2) \bigr) r_1^2(u_0)\,\Omega^{-1}}{4\, g^2(0 \mid x_0,u_0)\, f^2(x_0,u_0) \bigl( \mu_1^2(2) - \mu_2(2) \bigr)^2} \\
&- \frac{\mu_1(2)\bigl( \mu_1(2)\gamma_0(2) - \gamma_1(2) \bigr) r_1^2(u_0)\,\Omega^{-1}}{2\, g(0 \mid x_0,u_0)\, f(x_0,u_0) \bigl( \mu_1^2(2) - \mu_2(2) \bigr)},
\end{aligned} \tag{3.9}
\]
with r_1²(u_0) = ∫ f_1²(u) f(u, u_0) du and r_2²(x_0) = ∫ f_2²(u) f(x_0, u) du.
Remark 3.4. If the kernel functions K_1(·), K_2(·) are symmetric about zero and Lipschitz continuous, the results of Theorem 3.3 simplify to
\[
\sqrt{nh}\,\Biggl( \hat a_0(x_0) - a_0(x_0) - \frac{h^2 \mu_2(1)\, a_0''(x_0)}{2} \Biggr) \xrightarrow{d} N\!\Biggl( 0,\ \frac{\bigl( 1 - \omega'\Omega^{-1}\omega \bigr)\gamma_0(1)\, r_2^2(x_0)}{4} \Biggr),
\]
\[
\sqrt{nh}\,\Biggl( \hat a(u_0) - a(u_0) - \frac{h^2 \mu_2(2)\, c(u_0)}{2} \Biggr) \xrightarrow{d} N_p\!\Biggl( 0,\ \frac{\bigl( \Omega - \omega\omega' \bigr)^{-1}\gamma_0(2)\, r_1^2(u_0)}{4} \Biggr). \tag{3.10}
\]
Remark 3.5. Here we have considered the estimation method and asymptotic distributions for the case where the two bandwidths are the same. It is important to note that similar asymptotic theories can be obtained when the two bandwidths differ but have the same order.

Remark 3.6. One may also consider different bandwidths h_1, h_2 for the kernel functions K_1 and K_2. Suppose Assumptions (2) and (7) are replaced by h_1^5 h_2 = o(1/n) and h_1 h_2^5 = o(1/n) (for example, h_1 ∼ n^{−1/6}, h_2 ∼ n^{−1/5}) and max_i ‖Z_i‖ = o_p(min{h_2^{−2}, √(nh_1h_2)}), respectively. Then similar results are obtained, except that all the second-order derivative terms in the results disappear.
Remark 3.7. This paper restricts the study to a one-dimensional variable X. The ideas used here can be adapted to a higher dimensional variable X; for example, for a d-dimensional variable X with equal bandwidths, similar asymptotic distribution results for √(nh^d)(â_0(x_0) − a_0(x_0)) and √(nh)(â(u_0) − a(u_0)) can be obtained under the assumptions with Assumptions (2) and (7) replaced by h ∼ n^{−1/(5+d)} and max_i ‖Z_i‖ = o_p(n^{2/(5+d)}), respectively.
4. Simulations

In this section, we carry out simulations to illustrate the performance of the L1 method and compare it with that of the L2 method. All the following simulations are conducted with sample size n = 100.

The following example is considered:
\[
Y = a_0(X) + a_1(U) Z_1 + a_2(U) Z_2 + \varepsilon, \tag{4.1}
\]
where a_0(x) = x + 4 exp(−8x²), a_1(u) = 2 sin(1.5πu) + u, a_2(u) = 4u(1 − u), (X, U) ∼ U((−0.5, 0.5)²), (Z_1, Z_2) is normally distributed with correlation coefficient 1/√2 and standard normal marginals, ε ∼ N(0, 0.2²), and ε, (X, U), and (Z_1, Z_2) are mutually independent.

In each simulation, the L1 estimators of a_0(x), a_1(u), a_2(u) were computed by solving the minimization problem (2.5) and using the integrated method described in (2.6) and (2.7). We use the Epanechnikov kernel, K(u) = (3/4)(1 − u²) I_{|u|<1}, for each K_l(·), l = 1, 2. All bandwidths are selected by the method proposed in Section 2.
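One replication of the simulation design can be generated as follows. The functional forms are as reconstructed from the text (the source rendering is partly garbled, so treat the exact formulas in this sketch as assumptions).

```python
import numpy as np

def simulate(n, rng):
    """Draw one sample of size n from model (4.1):
    a0(x) = x + 4*exp(-8 x^2), a1(u) = 2*sin(1.5*pi*u) + u,
    a2(u) = 4*u*(1 - u), eps ~ N(0, 0.2^2)."""
    X = rng.uniform(-0.5, 0.5, n)
    U = rng.uniform(-0.5, 0.5, n)
    # (Z1, Z2): standard normal marginals, correlation 1/sqrt(2)
    rho = 1.0 / np.sqrt(2.0)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    Z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    eps = rng.normal(0.0, 0.2, n)
    a0 = X + 4.0 * np.exp(-8.0 * X**2)
    a1 = 2.0 * np.sin(1.5 * np.pi * U) + U
    a2 = 4.0 * U * (1.0 - U)
    Y = a0 + a1 * Z[:, 0] + a2 * Z[:, 1] + eps
    return Y, X, U, Z
```

Each Monte Carlo replication draws a fresh sample with this routine and then applies the L1 estimation procedure of Section 2.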
To evaluate the asymptotic results of Theorem 3.3, quantile-quantile plots of the estimators are constructed. Figure 1 presents the quantile-quantile plots for â_0(0), â_1(0), â_2(0) with sample size n = 100 and 100 replications; these plots reveal that the asymptotic approximation is reasonable.

Figure 2 displays the true curves of a_0(x), a_1(u), and a_2(u) together with their estimated curves for sample size n = 100 and one replication. We can see from the figure that the L1 estimates perform well.

To illustrate that the L1 method is robust, Figure 3 displays the estimated curves with four outliers, namely (−0.4538, −0.4402, −10), (0.13377, 0.1393, 20), (0.2703, 0.3085, 20), and (0.4174, 0.4398, 18) for the triple (x, u, y). From Figure 3 we can see that the L1 estimate performs well even in the presence of four large aberrant values of Y; the outliers have little influence on the L1 estimates, which confirms the robustness of the method.

By solving the minimization problem
\[
\min \sum_{i=1}^{n} \bigl( Y_i - a_0 - b_0(X_i - x_0) - a' Z_i - b'(U_i - u_0) Z_i \bigr)^2 K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0), \tag{4.2}
\]
we can similarly obtain the L2 estimators of a_0(x), a_1(u), a_2(u) through (2.6) and (2.7). To compare the L1 method with the L2 method, we estimated the function a_0(x) by the L2 method and display the fitted curves with and without outliers, for sample size n = 100 and 1000 replications, in Figure 4. We can see that the L2 estimate does not perform well under data sparsity and aberrant values: three small outlying data points make the L2 estimated curve deviate significantly from the true curve. Combining Figures 2, 3, and 4,
Figure 1: Normal QQ-plots for â_0(0), â_1(0), and â_2(0) (n = 100); panels (a)–(c) plot sample quantiles against standard normal quantiles.
we conclude that the L1 method performs better than the L2 method; the L1 method is robust.

Finally, to compare the L1 and L2 estimation methods further, we also assess their performance via the weighted average squared error (WASE), defined as
\[
\mathrm{WASE} = \frac{1}{n} \sum_{i=1}^{n} \Biggl( \frac{\bigl( \hat a_0(x_i) - a_0(x_i) \bigr)^2}{\mathrm{range}^2(a_0)} + \frac{\bigl( \hat a_1(u_i) - a_1(u_i) \bigr)^2}{\mathrm{range}^2(a_1)} + \frac{\bigl( \hat a_2(u_i) - a_2(u_i) \bigr)^2}{\mathrm{range}^2(a_2)} \Biggr), \tag{4.3}
\]
where range(a_k), k = 0, 1, 2, is the range of the function a_k. The weights are introduced to account for the different scales of the functions. We conducted 200 replications with sample size n = 100. For the bandwidths and the Epanechnikov kernels used in the simulations, the mean and standard deviation of the WASE are 0.1201 and 0.0183 for the L1 method, and 1.6613 and 0.8936 for the L2 method. The L1 method clearly outperforms the L2 method.
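A range-weighted error of this kind can be computed as follows. This sketch evaluates the errors on evaluation grids rather than at the observed design points, a minor simplification of (4.3); the function names are illustrative.

```python
import numpy as np

def wase(est, true, grid_x, grid_u):
    """Weighted average squared error in the spirit of (4.3).

    `est` and `true` are triples of callables (a0, a1, a2); each mean
    squared error is divided by the squared range of the corresponding
    true function, so functions on different scales contribute comparably.
    """
    (e0, e1, e2), (t0, t1, t2) = est, true

    def sq_range(f, g):                 # squared range of f over grid g
        vals = f(g)
        return (vals.max() - vals.min()) ** 2

    terms = ((e0, t0, grid_x), (e1, t1, grid_u), (e2, t2, grid_u))
    return sum(np.mean((e(g) - t(g)) ** 2) / sq_range(t, g) for e, t, g in terms)
```

By construction the criterion is zero exactly when every estimated curve coincides with its target on the grid, and any discrepancy increases it.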
Figure 2: Simulation results for the example. Dotted curves are the L1 estimates of a_0(x), a_1(u), a_2(u); solid curves are the true curves.
Figure 3: Simulation results for the example with outliers. Dotted curves are the L1 estimates; solid curves are the true curves.
Figure 4: Simulation results for a_0(x) by the L2 method. Dotted curves are the L2 estimates; solid curves are the true curves. Panel (N): without outliers; panel (Y): with three outliers (x, y) = (−0.4538, −5), (0.1948, 8), and (0.4649, 5).
5. Application

In this section, a real data set is analyzed by the proposed L1 method. The classic gas furnace data were recently studied by Wong et al. [1]. The data set includes 296 samples (I_t, O_t), t = 1, …, 296, measured at a fixed interval of 9 seconds, where I_t is the input gas rate in cubic feet per minute and O_t is the concentration of carbon dioxide in the gas out of the furnace. Following the procedure of Wong et al. [1], the original data are transformed as x_t = (I_t + 2.716)/5.55 and y_t = (O_t − 45.6)/14.9 for t = 1, …, 296, so that both the x's and the y's lie in the interval (0, 1), and the model
\[
y_t = a_0(x_{t-3}) + \sum_{j=1}^{4} a_j(y_{t-3})\, y_{t-j} + \varepsilon_t \tag{5.1}
\]
is used to fit the data. The first 250 samples are used to build the model, and the remaining 46 samples are used for prediction.

In the proposed method, the Epanechnikov kernel is used, and all bandwidths h_j, j = 0, …, 4, are set to 0.14 via cross-validation for simplicity. The mean absolute error (MAE) and mean squared error (MSE) for fitting and forecasting are as follows. Fitting: MAE = 0.3189, MSE = 0.1584; forecasting: MAE = 0.3705, MSE = 0.2127.

Since the model is chosen based on L2 errors, the results are not as good as those in Wong et al. [1]. Moreover, the data set does not contain obvious outliers, so the advantage of the L1 estimation method is not apparent. Compared with the results reported in Wong et al. [1], the difference between the fitting MAE/MSE and the forecasting MAE/MSE is small.
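The data transformation and the error criteria used above are straightforward to implement. The scaling constants below are as reconstructed from the text, so treat the exact forms as assumptions.

```python
import numpy as np

def transform(I, O):
    """Scale the gas furnace series to (0, 1):
    x_t = (I_t + 2.716)/5.55, y_t = (O_t - 45.6)/14.9."""
    I, O = np.asarray(I, dtype=float), np.asarray(O, dtype=float)
    return (I + 2.716) / 5.55, (O - 45.6) / 14.9

def mae_mse(y, yhat):
    """Mean absolute error and mean squared error used to report
    the fitting and forecasting accuracy."""
    r = np.asarray(y, dtype=float) - np.asarray(yhat, dtype=float)
    return float(np.abs(r).mean()), float((r ** 2).mean())
```

In the fitting/forecasting split above, `mae_mse` would be applied separately to the first 250 fitted values and the last 46 predictions.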
Figure 5: (a) Fitted values of O_t (t = 5, …, 250) for the gas data; (b) predicted values of O_t (t = 251, …, 296) for the gas data. Dotted lines are fitted or predicted CO2 concentrations; solid lines are the observed values.
The reason is that L1 methods are employed in our procedure. The fitted values {Ô_t}_{t=5}^{250} and the predicted values {Ô_t}_{t=251}^{296} are shown in Figure 5. These results indicate that the estimates are reasonable.
6. Proofs of the Main Results

Before giving the proofs of the main results, we first state a useful lemma.

Lemma 6.1. Suppose Assumptions (1)–(7) hold. Then, for any fixed (α_0, β_0, α, β), R_n converges to 0 in probability; that is, R_n = o_p(1) as n → ∞.

Proof of Lemma 6.1. Let
\[
d_n = \max_{1 \le i \le n} \Biggl\{ \Bigl| \frac{1}{\sqrt{nh^2}} \Bigl( \alpha_0 + \beta_0 \frac{X_i - x_0}{h} + \alpha' Z_i + \beta' \frac{U_i - u_0}{h} Z_i \Bigr) \Bigr| + \Bigl| \tfrac12 a_0''(\xi_i)(X_i - x_0)^2 + \tfrac12 \sum_{j=1}^{p} a_j''(\eta_{ij})(U_i - u_0)^2 Z_{ij} \Bigr| \Biggr\}. \tag{6.1}
\]
Noting that |X_i − x_0| ≤ Ch and |U_i − u_0| ≤ Ch on the kernel supports, and using Assumptions (2), (5), and (7), for any fixed α_0, β_0, α, and β we have
\[
d_n = o_p(1) \quad \text{as } n \to \infty. \tag{6.2}
\]
Let
\[
\begin{aligned}
T_{ni} = L_{ni} + \Biggl\{ & \Bigl| \frac{1}{\sqrt{nh^2}} \Bigl( \alpha_0 + \beta_0 \frac{X_i - x_0}{h} + \alpha' Z_i + \beta' \frac{U_i - u_0}{h} Z_i \Bigr) - \Bigl( \tfrac12 a_0''(\xi_i)(X_i - x_0)^2 + \tfrac12 \sum_{j=1}^{p} a_j''(\eta_{ij})(U_i - u_0)^2 Z_{ij} + \varepsilon_i \Bigr) \Bigr| \\
& - \Bigl| \tfrac12 a_0''(\xi_i)(X_i - x_0)^2 + \tfrac12 \sum_{j=1}^{p} a_j''(\eta_{ij})(U_i - u_0)^2 Z_{ij} + \varepsilon_i \Bigr| \Biggr\}\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0). 
\end{aligned} \tag{6.3}
\]
It is easy to see that T_{ni} = 0 if |ε_i| ≥ d_n. Hence we have
\[
|T_{ni}| \le 2 \Bigl| \frac{1}{\sqrt{nh^2}} \Bigl( \alpha_0 + \beta_0 \frac{X_i - x_0}{h} + \alpha' Z_i + \beta' \frac{U_i - u_0}{h} Z_i \Bigr) \Bigr|\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0)\, I_{\{|\varepsilon_i| < d_n\}}, \tag{6.4}
\]
where I(·) is the indicator function. For any ϵ > 0 and δ > 0, we have
\[
P\{|R_n| > \epsilon\} \le P\{d_n \ge \delta\} + P\bigl\{ I_{\{d_n < \delta\}} |R_n| > \epsilon \bigr\}. \tag{6.5}
\]
Since E sgn(ε_i) = 0, we have
\[
E\bigl( I_{\{d_n<\delta\}} R_n^2 \bigr) = E\Bigl( I_{\{d_n<\delta\}} \bigl( S_n - F_n + L_n \bigr)^2 \Bigr) = E\Biggl( I_{\{d_n<\delta\}} \Bigl( \sum_{i=1}^{n} \bigl( T_{ni} - E T_{ni} \bigr) \Bigr)^{\!2} \Biggr), \tag{6.6}
\]
where L_n = lim_{n→∞} E Σ_{i=1}^n L_{ni}. Combining (6.2) and (6.4), we have
\[
\begin{aligned}
\lim_{n\to\infty} E\bigl( I_{\{d_n<\delta\}} R_n^2 \bigr)
&= \lim_{n\to\infty} E\Biggl\{ I_{\{d_n<\delta\}} \Biggl[ \sum_{i=1}^{n} E\bigl( (T_{ni} - ET_{ni})^2 \mid X, U, Z \bigr) + \sum_{i \ne j} E\bigl( (T_{ni} - ET_{ni})(T_{nj} - ET_{nj}) \mid X, U, Z \bigr) \Biggr] \Biggr\} \\
&\le 8 \lim_{n\to\infty} E\Biggl\{ \sum_{i=1}^{n} \Bigl( \alpha_0 + \beta_0 \frac{X_i - x_0}{h} + \alpha' Z_i + \beta' \frac{U_i - u_0}{h} Z_i \Bigr)^{\!2} \frac{1}{nh^2}\, K_{1,h}^2(X_i - x_0)\, K_{2,h}^2(U_i - u_0)\, P\{|\varepsilon_i| \le \delta \mid X, U\} \Biggr\} \\
&\le 16\delta\, g(0 \mid x_0, u_0)\, f(x_0, u_0) \Bigl[ \gamma_0(2)\bigl( \alpha_0^2\gamma_0(1) + 2\alpha_0\beta_0\gamma_1(1) + \beta_0^2\gamma_2(1) \bigr) + \gamma_0(1)\gamma_0(2)\,\alpha'\Omega\alpha + 2\gamma_2(1)\,\alpha'\Omega\beta + \gamma_2(2)\,\beta'\Omega\beta \Bigr] + o(1) \\
&\longrightarrow 0
\end{aligned} \tag{6.7}
\]
as δ → 0. Combining (6.2), (6.5), and the fact that E(I_{\{d_n<\delta\}} R_n²) → 0 as δ → 0 and n → ∞, the desired conclusion follows from Chebyshev's inequality. This completes the proof of Lemma 6.1.
Proof of Theorem 3.1. Set
\[
\begin{aligned}
A(X_i, U_i, Z_i) &= \frac{1}{\sqrt{nh^2}} \Bigl( \alpha_0 + \beta_0 \frac{X_i - x_0}{h} + \alpha' Z_i + \beta' \frac{U_i - u_0}{h} Z_i \Bigr), \\
B(X_i, U_i, Z_i) &= \tfrac12 \Bigl( a_0''(\xi_i)(X_i - x_0)^2 + \sum_{j=1}^{p} a_j''(\eta_{ij})(U_i - u_0)^2 Z_{ij} \Bigr).
\end{aligned} \tag{6.8}
\]
By Lemma 6.1, under Assumptions (1)–(7), A(X_i, U_i, Z_i) and B(X_i, U_i, Z_i) converge to zero as n → ∞. Start from the equality
\[
F_n = I_{\{d_n<\delta\}} F_n + I_{\{d_n\ge\delta\}} F_n. \tag{6.9}
\]
We first give the limit form of F_n. For fixed (α_0, β_0, α, β), conditioning on (X, U, Z) and integrating over the error density on (−δ, δ) and its tails, we have
\[
\begin{aligned}
I_{\{d_n<\delta\}} F_n
&= I_{\{d_n<\delta\}}\, E \sum_{i=1}^{n} \bigl( |A(X_i,U_i,Z_i) - B(X_i,U_i,Z_i) - \varepsilon_i| - |B(X_i,U_i,Z_i) + \varepsilon_i| \bigr)\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0) \\
&= I_{\{d_n<\delta\}}\, n E\bigl\{ E\bigl[ \bigl( |A(X_1,U_1,Z_1) - B(X_1,U_1,Z_1) - \varepsilon_1| - |B(X_1,U_1,Z_1) + \varepsilon_1| \bigr)\, K_{1,h}(X_1 - x_0)\, K_{2,h}(U_1 - u_0) \,\big|\, X, U, Z \bigr] \bigr\} \\
&= I_{\{d_n<\delta\}}\, n E\Biggl\{ \Biggl[ \int_{A-B < w < \delta} \bigl( w + B - A \bigr) g(w \mid X_1, U_1)\, dw + \int_{-\delta < w < A-B} \bigl( A - B - w \bigr) g(w \mid X_1, U_1)\, dw \\
&\qquad - \int_{-B < w < \delta} \bigl( w + B \bigr) g(w \mid X_1, U_1)\, dw + \int_{-\delta < w < -B} \bigl( B + w \bigr) g(w \mid X_1, U_1)\, dw \\
&\qquad - \int_{w > \delta} A\, g(w \mid X_1, U_1)\, dw + \int_{w < -\delta} A\, g(w \mid X_1, U_1)\, dw \Biggr]\, K_{1,h}(X_1 - x_0)\, K_{2,h}(U_1 - u_0) \Biggr\},
\end{aligned} \tag{6.10}
\]
where A = A(X_1, U_1, Z_1) and B = B(X_1, U_1, Z_1) inside the integrals.
By the integral mean value theorem (see the Appendix), writing A_1 = A(X_1, U_1, Z_1) and B_1 = B(X_1, U_1, Z_1), we have
\[
\begin{aligned}
I_{\{d_n<\delta\}} F_n
= n I_{\{d_n<\delta\}}\, E\Biggl\{ \Biggl[ & g(\tau_1 \mid X_1, U_1) \Bigl( \frac{\delta^2}{2} + \frac{(B_1 - A_1)^2}{2} + (B_1 - A_1)\delta \Bigr) \\
&+ g(\tau_2 \mid X_1, U_1) \Bigl( \frac{\delta^2}{2} + \frac{(B_1 - A_1)^2}{2} + (A_1 - B_1)\delta \Bigr) \\
&+ A_1 \bigl( G(\delta \mid X_1, U_1) + G(-\delta \mid X_1, U_1) - 1 \bigr) \\
&- g(\tau_3 \mid X_1, U_1) \Bigl( \frac{\delta^2}{2} + \frac{B_1^2}{2} + B_1\delta \Bigr)
- g(\tau_4 \mid X_1, U_1) \Bigl( \frac{\delta^2}{2} + \frac{B_1^2}{2} - B_1\delta \Bigr) \Biggr]\, K_{1,h}(X_1 - x_0)\, K_{2,h}(U_1 - u_0) \Biggr\},
\end{aligned} \tag{6.11}
\]
where G(·|·) is the conditional probability distribution function of ε, and τ_1, τ_2, τ_3, τ_4 converge to zero as δ → 0 and n → ∞. Then by Assumptions (1)–(5), for any small enough δ > 0, we have
\[
\begin{aligned}
I_{\{d_n<\delta\}} F_n
&= I_{\{d_n<\delta\}}\, n E\Bigl\{ \bigl( g(0 \mid X_1, U_1) + o(1) \bigr) \bigl( A^2(X_1,U_1,Z_1) - 2 A(X_1,U_1,Z_1) B(X_1,U_1,Z_1) \bigr)\, K_{1,h}(X_1 - x_0)\, K_{2,h}(U_1 - u_0) \Bigr\} \\
&= I_{\{d_n<\delta\}}\, g(0 \mid x_0, u_0)\, f(x_0, u_0) \Bigl[ \alpha_0^2 + 2\alpha_0\beta_0\,\mu_1(1) + \beta_0^2\,\mu_2(1) + \alpha'\Omega\alpha + 2\mu_1(2)\,\alpha'\Omega\beta + \mu_2(2)\,\beta'\Omega\beta \\
&\qquad + 2\alpha_0\bigl(\alpha'\omega + \mu_1(2)\,\beta'\omega\bigr) + 2\beta_0\,\mu_1(1)\bigl(\alpha'\omega + \mu_1(2)\,\beta'\omega\bigr) \\
&\qquad - a_0''(x_0)\bigl(\alpha_0\,\mu_2(1) + \beta_0\,\mu_3(1) + \mu_2(1)\,\alpha'\omega + \mu_2(1)\mu_1(2)\,\beta'\omega\bigr) \\
&\qquad - c(u_0)'\bigl(\alpha_0\,\mu_2(2)\,\omega + \beta_0\,\mu_1(1)\mu_2(2)\,\omega + \mu_2(2)\,\Omega\alpha + \mu_3(2)\,\Omega\beta\bigr) \Bigr] + o(1) \\
&= I_{\{d_n<\delta\}}\, F(\alpha_0,\beta_0,\alpha,\beta) + o(1),
\end{aligned}
\]
and hence
\[
F_n = F(\alpha_0,\beta_0,\alpha,\beta) + o(1) - \bigl( F(\alpha_0,\beta_0,\alpha,\beta) + o(1) - F_n \bigr) I_{\{d_n\ge\delta\}}. \tag{6.12}
\]
Noting that, for any fixed (α_0, β_0, α, β), d_n → 0 as n → ∞, we draw the desired conclusion. This completes the proof of Theorem 3.1.
Proof of Theorem 3.3. Since the proofs of (3.7) and (3.8) are quite similar, we only give the proof of (3.8).
Start from the equality
\[
L_n = \sum_{i=1}^{n} \frac{1}{\sqrt{nh^2}} \Bigl( \alpha_0 + \beta_0 \frac{X_i - x_0}{h} + \alpha' Z_i + \beta' \frac{U_i - u_0}{h} Z_i \Bigr)\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i). \tag{6.13}
\]
By Lemma 6.1 and Theorem 3.1, for fixed (α_0, β_0, α, β), we have
\[
S_n = F(\alpha_0,\beta_0,\alpha,\beta) + o(1) - L_n + R_n = F(\alpha_0,\beta_0,\alpha,\beta) - L_n + R_n + o(1); \tag{6.14}
\]
hence
\[
S_n + L_n = F(\alpha_0,\beta_0,\alpha,\beta) + R_n + o(1). \tag{6.15}
\]
Note that, since the summands L_{ni} are conditionally uncorrelated,
\[
\begin{aligned}
E L_n^2 &= \sum_{i=1}^{n} E L_{ni}^2 \\
&\le 2 n E\Biggl\{ \Bigl( \alpha_0 + \beta_0 \frac{X_i - x_0}{h} \Bigr)^{\!2} \frac{1}{nh^2}\, K_{1,h}^2(X_i - x_0)\, K_{2,h}^2(U_i - u_0) + \Bigl( \alpha' Z_i + \beta' \frac{U_i - u_0}{h} Z_i \Bigr)^{\!2} \frac{1}{nh^2}\, K_{1,h}^2(X_i - x_0)\, K_{2,h}^2(U_i - u_0) \Biggr\} \\
&= 2 \Bigl[ \gamma_0(2)\bigl( \alpha_0^2\gamma_0(1) + 2\alpha_0\beta_0\gamma_1(1) + 2\beta_0^2\gamma_2(1) \bigr) + \gamma_0(1)\,\alpha'\Omega\alpha + 2\gamma_1(1)\,\alpha'\Omega\beta + \gamma_2(2)\,\beta'\Omega\beta \Bigr] \bigl( 1 + o(1) \bigr). 
\end{aligned} \tag{6.16}
\]
Hence L_n is bounded in probability for any fixed (α_0, β_0, α, β). Thus, for any fixed (α_0, β_0, α, β), the random convex function S_n + L_n converges to F(α_0, β_0, α, β). According to the convexity lemma [17], we can deduce that, for any compact set K,
\[
\sup_{(\alpha_0,\beta_0,\alpha',\beta') \in K} |R_n| = o_p(1) \tag{6.17}
\]
as δ → 0. By the proof of Theorem 2 in Wang [18], the convergence here holds not only in the sense of a sequence of random variables but also in the sense of a sequence of stochastic processes, and the minimizer of S_n converges to the minimizer of F(α_0, β_0, α, β) − L_n.
By the convexity of F(α_0, β_0, α, β), minimizing F(α_0, β_0, α, β) − L_n gives
\[
\begin{aligned}
\hat\beta_0 &= -\frac{\sum_{i=1}^{n} \bigl( \mu_1(1) - (X_i - x_0)/h \bigr) \bigl(1/\sqrt{nh^2}\bigr)\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i)}{2\, g(0 \mid x_0,u_0)\, f(x_0,u_0) \bigl( \mu_1^2(1) - \mu_2(1) \bigr)} + \frac{\bigl( \mu_2(1)\mu_1(1) - \mu_3(1) \bigr) a_0''(x_0)}{2 \bigl( \mu_1^2(1) - \mu_2(1) \bigr)}, \\
\hat\beta &= -\frac{\Omega^{-1} \sum_{i=1}^{n} \bigl( \mu_1(2) - (U_i - u_0)/h \bigr) Z_i \bigl(1/\sqrt{nh^2}\bigr)\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i)}{2\, g(0 \mid x_0,u_0)\, f(x_0,u_0) \bigl( \mu_1^2(2) - \mu_2(2) \bigr)} + \frac{\bigl( \mu_2(2)\mu_1(2) - \mu_3(2) \bigr) c(u_0)}{2 \bigl( \mu_1^2(2) - \mu_2(2) \bigr)}, \\
\hat\alpha &= \frac{\bigl( \mu_2^2(2) - \mu_1(2)\mu_3(2) \bigr) c(u_0)}{2 \bigl( \mu_2(2) - \mu_1^2(2) \bigr)} + \frac{1}{2\sqrt{nh^2}} \bigl( \Omega - \omega\omega' \bigr)^{-1} \sum_{i=1}^{n} \bigl( Z_i - \omega \bigr)\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i) \\
&\quad - \frac{\mu_1(2)\,\Omega^{-1} \sum_{i=1}^{n} \bigl( \mu_1(2) - (U_i - u_0)/h \bigr) Z_i \bigl(1/\sqrt{nh^2}\bigr)\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i)}{2\, g(0 \mid x_0,u_0)\, f(x_0,u_0) \bigl( \mu_1^2(2) - \mu_2(2) \bigr)}, \\
\hat\alpha_0 &= \frac{\bigl( \mu_2^2(1) - \mu_3(1)\mu_1(1) \bigr) a_0''(x_0)}{2 \bigl( \mu_2(1) - \mu_1^2(1) \bigr)} + \frac{1}{2\sqrt{nh^2}} \sum_{i=1}^{n} \bigl( 1 - \omega'\Omega^{-1} Z_i \bigr)\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i) \\
&\quad - \frac{\mu_1(1) \sum_{i=1}^{n} \bigl( \mu_1(1) - (X_i - x_0)/h \bigr) \bigl(1/\sqrt{nh^2}\bigr)\, K_{1,h}(X_i - x_0)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i)}{2\, g(0 \mid x_0,u_0)\, f(x_0,u_0) \bigl( \mu_1^2(1) - \mu_2(1) \bigr)}.
\end{aligned} \tag{6.18}
\]
Thus, writing α̂(X_k, u_0) for the local solution (6.18) at x_0 = X_k, we obtain
\[
\begin{aligned}
\frac{\hat\alpha(u_0)}{\sqrt h} &= \frac{1}{n} \sum_{k=1}^{n} \frac{\hat\alpha(X_k, u_0)}{\sqrt h} \\
&= \frac{1}{2nh\sqrt{nh}} \sum_{k=1}^{n} \sum_{i=1}^{n} \bigl( \Omega - \omega\omega' \bigr)^{-1} (Z_i - \omega)\, K_{1,h}(X_i - X_k)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i) \\
&\quad - \frac{\mu_1(2)\,\Omega^{-1} \sum_{k=1}^{n} \sum_{i=1}^{n} \bigl( \mu_1(2) - (U_i - u_0)/h \bigr) Z_i\, K_{1,h}(X_i - X_k)\, K_{2,h}(U_i - u_0)\, \operatorname{sgn}(\varepsilon_i)}{2nh\sqrt{nh}\, g(0 \mid x_0,u_0)\, f(x_0,u_0) \bigl( \mu_1^2(2) - \mu_2(2) \bigr)} \\
&\quad + \frac{\bigl( \mu_2^2(2) - \mu_1(2)\mu_3(2) \bigr) c(u_0)}{2\sqrt h \bigl( \mu_2(2) - \mu_1^2(2) \bigr)} \\
&=: A_1 + \frac{\bigl( \mu_2^2(2) - \mu_1(2)\mu_3(2) \bigr) c(u_0)}{2\sqrt h \bigl( \mu_2(2) - \mu_1^2(2) \bigr)}.
\end{aligned} \tag{6.19}
\]
By interchanging the order of summation and noting that
\[
\frac{1}{n} \sum_{k=1}^{n} K_{1,h}(X_i - X_k) = h f_1(X_i) \bigl( 1 + o_p(1) \bigr), \tag{6.20}
\]
A_1 can be rewritten as
\[
\begin{aligned}
A_1 &= \frac{1}{2\sqrt{nh}} \sum_{i=1}^{n} \bigl( \Omega - \omega\omega' \bigr)^{-1} (Z_i - \omega)\, K_{2,h}(U_i - u_0)\, f_1(X_i)\, \operatorname{sgn}(\varepsilon_i) \bigl( 1 + o_p(1) \bigr) \\
&\quad - \frac{\mu_1(2)\,\Omega^{-1} \sum_{i=1}^{n} \bigl( \mu_1(2) - (U_i - u_0)/h \bigr) Z_i\, K_{2,h}(U_i - u_0)\, f_1(X_i)\, \operatorname{sgn}(\varepsilon_i) \bigl( 1 + o_p(1) \bigr)}{2\sqrt{nh}\, g(0 \mid x_0,u_0)\, f(x_0,u_0) \bigl( \mu_1^2(2) - \mu_2(2) \bigr)}.
\end{aligned} \tag{6.21}
\]
For all t ∈ R^p, straightforward computations give the mean and variance of t'A_1:
\[
\begin{aligned}
E\bigl( t' A_1 \bigr) &= 0, \\
E\bigl( t' A_1 \bigr)^2 &= t' \Biggl[ \frac{\bigl( \Omega - \omega\omega' \bigr)^{-1}\gamma_0(2)\, r_1^2(u_0)}{4}
+ \frac{\mu_1^2(2)\bigl( \mu_1^2(2)\gamma_0(2) - 2\mu_1(2)\gamma_1(2) + \gamma_2(2) \bigr) r_1^2(u_0)\,\Omega^{-1}}{4\, g^2(0 \mid x_0,u_0)\, f^2(x_0,u_0) \bigl( \mu_1^2(2) - \mu_2(2) \bigr)^2} \\
&\qquad - \frac{\mu_1(2)\bigl( \mu_1(2)\gamma_0(2) - \gamma_1(2) \bigr) r_1^2(u_0)\,\Omega^{-1}}{2\, g(0 \mid x_0,u_0)\, f(x_0,u_0) \bigl( \mu_1^2(2) - \mu_2(2) \bigr)} \Biggr] t \bigl( 1 + o_p(1) \bigr).
\end{aligned} \tag{6.22}
\]
By Assumptions (1)–(7), and using the methods in the proof of Theorem 1 in Tang and Wang [19], one can easily check that the Lindeberg-Feller condition holds for t'A_1. The Cramér-Wold device then shows that (3.8) follows. This completes the proof of Theorem 3.3.
Appendix

Integral Mean Value Theorem

If f and g are continuous functions on [a, b], and g(x) ≥ 0 (or g(x) ≤ 0) on [a, b], then there exists ξ ∈ [a, b] such that
\[
\int_a^b f(x) g(x)\, dx = f(\xi) \int_a^b g(x)\, dx. \tag{A.1}
\]
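The theorem admits a quick numerical illustration: the ratio of the two integrals is a weighted mean of f under the weight g, so it must be attained by f at some point of [a, b]. The following sketch locates such a point on a grid using plain Riemann sums; all names are illustrative.

```python
import numpy as np

def mean_value_point(f, g, a, b, m=200001):
    """Numerically locate xi in [a, b] with
    int f*g = f(xi) * int g, for g >= 0 on [a, b]."""
    x = np.linspace(a, b, m)
    fx, gx = f(x), g(x)
    target = (fx * gx).sum() / gx.sum()   # weighted mean of f under g = f(xi)
    i = int(np.argmin(np.abs(fx - target)))
    return x[i], target
```

For f(x) = sin x and g(x) = x² on [0, π], the returned point satisfies sin(ξ) ≈ ∫ sin(x)x² dx / ∫ x² dx, as the theorem asserts.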
Acknowledgments
The authors gratefully acknowledge the helpful recommendations from the referees that led
to an improved revision of an earlier paper.
References

[1] H. Wong, R. Zhang, W.-c. Ip, and G. Li, "Functional-coefficient partially linear regression model," Journal of Multivariate Analysis, vol. 99, no. 2, pp. 278–305, 2008.
[2] H. Zhu, P. J. Brown, and J. S. Morris, "Robust, adaptive functional regression in functional mixed model framework," Journal of the American Statistical Association, vol. 106, no. 495, pp. 1167–1179, 2011.
[3] G. Aneiros-Pérez and P. Vieu, "Nonparametric time series prediction: a semi-functional partial linear modeling," Journal of Multivariate Analysis, vol. 99, no. 5, pp. 834–857, 2008.
[4] I. Ahmad, S. Leelahanon, and Q. Li, "Efficient estimation of a semiparametric partially linear varying coefficient model," The Annals of Statistics, vol. 33, no. 1, pp. 258–283, 2005.
[5] W. S. Cleveland, E. Grosse, and W. M. Shyu, "Local regression models," in Statistical Models in S, J. M. Chambers and T. J. Hastie, Eds., pp. 309–376, Wadsworth and Brooks, Pacific Grove, Calif, USA, 1992.
[6] D. R. Hoover, J. A. Rice, C. O. Wu, and L.-P. Yang, "Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data," Biometrika, vol. 85, no. 4, pp. 809–822, 1998.
[7] C. O. Wu, C.-T. Chiang, and D. R. Hoover, "Asymptotic confidence regions for kernel smoothing of a varying-coefficient model with longitudinal data," Journal of the American Statistical Association, vol. 93, no. 444, pp. 1388–1402, 1998.
[8] T. Hastie and R. Tibshirani, "Varying-coefficient models," Journal of the Royal Statistical Society Series B, vol. 55, no. 4, pp. 757–796, 1993, with discussion and a reply by the authors.
[9] C.-T. Chiang, J. A. Rice, and C. O. Wu, "Smoothing spline estimation for varying coefficient models with repeatedly measured dependent variables," Journal of the American Statistical Association, vol. 96, no. 454, pp. 605–619, 2001.
[10] J. Fan and W. Zhang, "Statistical estimation in varying coefficient models," The Annals of Statistics, vol. 27, no. 5, pp. 1491–1518, 1999.
[11] J. Fan and J.-T. Zhang, "Two-step estimation of functional linear models with applications to longitudinal data," Journal of the Royal Statistical Society Series B, vol. 62, no. 2, pp. 303–322, 2000.
[12] J. Fan and I. Gijbels, Local Polynomial Modelling and Its Applications, vol. 66 of Monographs on Statistics and Applied Probability, Chapman & Hall, London, UK, 1996.
[13] F. T. Wang and D. W. Scott, "The L1 method for robust nonparametric regression," Journal of the American Statistical Association, vol. 89, no. 425, pp. 65–76, 1994.
[14] M. S. Bazaraa and C. M. Shetty, Nonlinear Programming: Theory and Algorithms, John Wiley & Sons, New York, NY, USA, 1979.
[15] J. A. Rice and B. W. Silverman, "Estimating the mean and covariance structure nonparametrically when the data are curves," Journal of the Royal Statistical Society Series B, vol. 53, no. 1, pp. 233–243, 1991.
[16] B. L. S. Prakasa Rao, Asymptotic Theory of Statistical Inference, John Wiley & Sons, New York, NY, USA, 1987.
[17] D. Pollard, "Asymptotics for least absolute deviation regression estimators," Econometric Theory, vol. 7, no. 2, pp. 186–199, 1991.
[18] J. D. Wang, "Asymptotic normality of L1-estimators in nonlinear regression," Journal of Multivariate Analysis, vol. 54, no. 2, pp. 227–238, 1995.
[19] Q. Tang and J. Wang, "L1-estimation for varying coefficient models," Statistics, vol. 39, no. 5, pp. 389–404, 2005.