Asymptotic properties in a semi-functional
partial linear regression model
Germán Aneiros-Pérez¹ and Philippe Vieu²
¹ Departamento de Matemáticas, Facultad de Informática, Universidade da Coruña, Campus de Elviña s/n, 15071 A Coruña, Spain, ganeiros@udc.es
² Laboratoire de Statistique et Probabilités, Université Paul Sabatier, Toulouse 3, 118 route de Narbonne, 31062 Toulouse Cedex, France, vieu@cict.fr
Summary. A new regression model is introduced in order to capture both the advantages of semi-linear modelling (see [AGV04]) and those of the recent advances in nonparametric statistics for functional data (see [FV06]). This leads us to the following so-called Semi-Functional Partial Linear Regression model:
$$Y = r(X_1, \dots, X_p, T) + \varepsilon = \sum_{j=1}^{p} X_j \beta_j + m(T) + \varepsilon, \qquad (1)$$
where $X_j$ and $T$ are real and functional explanatory variables, respectively. Estimates for the vector of parameters $\beta$ and the function $m$ in (1) are presented and some asymptotic results are given. Specifically, we obtain:
(i) $\sqrt{n}\,(\hat\beta_h - \beta) \xrightarrow{d} N(0, \sigma_\varepsilon^2 B^{-1})$,
(ii) $\limsup_{n\to\infty} (n/(2\log\log n))^{1/2}\, |\hat\beta_{hj} - \beta_j| = (\sigma_\varepsilon^2 b^{jj})^{1/2}$ a.s.,
(iii) $\sup_{t \in C} |\hat m_h(t) - m(t)| = O(h^{\alpha}) + O\!\left(\sqrt{\log n/(n\phi(h))}\right)$ a.s.
Finally, a real data example illustrates the usefulness of the model.
1 Introduction
Since the introductory work by [EGRW86], the partial linear model has been
widely studied (see [S88], [C88], [S96], [AGV04], and references therein) and
its interest has been emphasized in many fields of applied statistics. The aim
of such a model is to allow some of the explanatory variables to act
in a free nonparametric manner, while the others are controlled by means of
a parametric (linear) relation. Until now, this kind of model has only been
investigated in the situation where all the explanatory variables take real
values. See [HLG00] for a monograph on these models.
On the other hand, functional data are attracting the attention of many researchers, because of their growing importance in the many fields of applied statistics in which the statistical observations are curves.
The reader can find the state of the art in parametric modelling (respectively in nonparametric modelling) for functional data in [RS02]
and [RS05] (respectively in [FV06]). In the setting of functional
regression problems, the most recent advances are those by [CFS03] (concerning
parametric linear models) and those by [FV06] (concerning nonparametric
approaches).
The aim of our note is to combine the flexibility of partial linear modelling with the recent methodology for the nonparametric treatment of
functional data. This leads us to the following so-called Semi-Functional Partial Linear Regression model (SFPLR model)
$$Y = r(X_1, \dots, X_p, T) + \varepsilon = \sum_{j=1}^{p} X_j \beta_j + m(T) + \varepsilon, \qquad (2)$$
where $X_j$ ($j = 1, \dots, p$) are real explanatory variables, $T$ is another explanatory variable but of functional nature, $\varepsilon$ is a random error satisfying
$$E(\varepsilon \mid X_1, \dots, X_p, T) = 0,$$
$\beta = (\beta_1, \dots, \beta_p)^T$ is a vector of unknown real parameters and $m$ is an unknown smooth real function. According to [FV06], the most interesting spaces to model functional variables are semi-metric spaces. In other words, we consider that $T$ is valued in some abstract semi-metric space $H$ and we denote by $d(\cdot, \cdot)$ the associated semi-metric. In the rest of the paper, all the topological notions used derive from the topology $T_d$ associated with this semi-metric.
In this note, we will study the SFPLR model. More precisely, in Section 2
we construct estimates based on a sample of independent and identically distributed vectors, while in Section 3 we derive their first asymptotic properties.
Finally, Section 4 illustrates how our general methodology
applies to a spectrometric data set.
2 The model and the estimates
Assume that we have a sample of $n$ independent and identically distributed vectors valued in $\mathbb{R}^{p+1} \times C$ ($C \subset H$). These vectors will be denoted from now on by
$$\{(Y_i, X_{i1}, \dots, X_{ip}, T_i)\}_{i=1}^{n}.$$
The SFPLR model can be rewritten by assuming that we have
$$Y_i = \sum_{j=1}^{p} X_{ij} \beta_j + m(T_i) + \varepsilon_i \quad (i = 1, \dots, n), \qquad (3)$$
where
$$E(\varepsilon_i \mid X_{i1}, \dots, X_{ip}, T_i) = 0 \quad (i = 1, \dots, n). \qquad (4)$$
For technical reasons, we will assume that
$$C \text{ is some given compact subset of } H \text{ such that } C \subset \bigcup_{k=1}^{\tau_n} B(z_k, l_n), \qquad (5)$$
where $\tau_n l_n^{\gamma} = C$ ($\gamma$ and $C$ denote real positive constants), and $\tau_n \to \infty$ and $l_n \to 0$ as $n \to \infty$ (we have denoted $B(t, h) = \{t' \in H;\ d(t', t) < h\}$). The
compactness of $C$ is a usual condition in the setting of nonfunctional partial
linear models (see, e.g., [C88], [BZ97], and [L00]), while the conditions on $\tau_n$ and $l_n$
are usual ones in the setting of functional nonparametric models (see [FV06],
Section 9.7).
We estimate the vector of parameters $\beta$ and the function $m$ in (2) by means of
$$\hat\beta_h = (\tilde X_h^T \tilde X_h)^{-1} \tilde X_h^T \tilde Y_h \qquad (6)$$
and
$$\hat m_h(t) = \sum_{i=1}^{n} w_{n,h}(t, T_i)\,(Y_i - X_i^T \hat\beta_h), \qquad (7)$$
respectively. In these estimators, $h$ is a smoothing parameter that typically appears in any setting of nonparametric estimation. Furthermore, we have denoted $X = (X_1, \dots, X_n)^T$ with $X_i = (X_{i1}, \dots, X_{ip})^T$, $Y = (Y_1, \dots, Y_n)^T$ and, for any $(n \times q)$-matrix $A$ ($q \ge 1$), $\tilde A_h = (I - W_h) A$, where $W_h = (w_{n,h}(T_i, T_j))_{i,j}$ with $w_{n,h}(\cdot, \cdot)$ being a weight function that can take different forms. Concretely, in this paper we will focus on the weights
$$w_{n,h}(t, T_i) = \frac{K(d(t, T_i)/h)}{\sum_{j=1}^{n} K(d(t, T_j)/h)}, \qquad (8)$$
where K is a function from [0, ∞) into [0, ∞). These weights, used in [FV06]
for a purely nonparametric regression model, are a functional version of the
Nadaraya-Watson type weights.
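As an illustration of how (6)-(8) fit together, the following Python sketch computes the weight matrix $W_h$ from a precomputed matrix of semi-metric values and then the two estimates. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the triangular kernel $K(u) = 1 - u$ on $[0, 1]$ and the use of a least-squares solver in place of the explicit inverse in (6) are all choices made for this example.

```python
import numpy as np


def triangular_kernel(u):
    """Illustrative kernel K(u) = 1 - u on [0, 1], zero beyond (assumed choice)."""
    return np.clip(1.0 - u, 0.0, None)


def nw_weights(dist, h, kernel=triangular_kernel):
    """Weights (8): entry (i, j) is w_{n,h}(t_i, T_j), where dist[i, j] = d(t_i, T_j)."""
    k = kernel(dist / h)
    total = k.sum(axis=1, keepdims=True)
    if np.any(total == 0):
        raise ValueError("bandwidth h too small: some point has no neighbour")
    return k / total


def fit_sfplr(X, Y, dist_train, h, kernel=triangular_kernel):
    """Return beta_hat as in (6) and a function evaluating m_hat as in (7)."""
    W = nw_weights(dist_train, h, kernel)        # W_h = (w_{n,h}(T_i, T_j))
    X_tilde = X - W @ X                          # (I - W_h) X
    Y_tilde = Y - W @ Y                          # (I - W_h) Y
    # Least-squares solution of X_tilde beta = Y_tilde, i.e. estimator (6).
    beta_hat, *_ = np.linalg.lstsq(X_tilde, Y_tilde, rcond=None)
    partial_resid = Y - X @ beta_hat             # Y_i - X_i^T beta_hat

    def m_hat(dist_new):
        """dist_new[i, j] = d(t_i, T_j) for the points t_i where m is estimated."""
        return nw_weights(dist_new, h, kernel) @ partial_resid

    return beta_hat, m_hat
```

Here `dist_train[i, j]` stands for $d(T_i, T_j)$, so any semi-metric can be used to build it, and `m_hat` accepts the semi-metric values between new curves and the sample curves.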
As is usual in nonfunctional partial linear regression models, the conditions linked with the estimation of the nonparametric component $m$ are exactly the same as those used in pure nonparametric regression models. Therefore, we will naturally need the same set of assumptions as those originally proposed in [FV06]. This concerns the kernel function $K$, which is assumed to satisfy the following usual restrictions:
$$K \text{ has support } [0, 1], \text{ is Lipschitz continuous on } [0, \infty), \text{ and } \exists\, \theta \text{ such that } \forall u \in [0, 1],\ -K'(u) > \theta > 0, \qquad (9)$$
as well as the probability distribution of the infinite-dimensional process $T$, which is assumed to be such that there exist a positive-valued function $\phi$ on $(0, \infty)$ and positive constants $\alpha_0$, $\alpha_1$ and $\alpha_2$ such that
$$\int_0^1 \phi(hs)\, ds > \alpha_0 \phi(h) \quad \text{and} \quad \alpha_1 \phi(h) \le P(T \in B(t, h)) \le \alpha_2 \phi(h), \quad \forall\, t \in C,\ h > 0. \qquad (10)$$
The reader will find in [FV06] a discussion of the links between these
assumptions, the semi-metric $d$ and the small ball concentration properties of
$T$, as well as a discussion of how the small ball probability condition (10)
can be interpreted in the finite-dimensional setting in terms of standard conditions
on the density of $T$. The conditions on the smoothing parameter $h > 0$ are
standard and will be stated in the theorems below.
While these conditions are those needed to deal with the functional nonparametric component of the model, there is naturally a second set of conditions, linked with the linear part of the model. To state them, let us introduce the following notation:
$$g_j(t) = E(X_{ij} \mid T_i = t), \quad \eta_{ij} = X_{ij} - E(X_{ij} \mid T_i) \quad \text{and} \quad \eta_i = (\eta_{i1}, \dots, \eta_{ip})^T.$$
Observe that the expressions of our estimators (6) and (7) contain estimators
of $g_1, \dots, g_p$. So, in addition to the usual smoothness conditions on $m$ we need
similar ones on the $g_j$. More precisely, we assume that all the operators to be
estimated are smooth, in the sense that for some $C < \infty$ and some $\alpha > 0$ we
have
$$\forall\, (u, v) \in C \times C,\ \forall f \in \{m, g_1, \dots, g_p\}, \quad |f(u) - f(v)| \le C\, d(u, v)^{\alpha}. \qquad (11)$$
Furthermore, we need the following assumptions:
$$E|\varepsilon_1|^r + E|\eta_{11}|^r + \cdots + E|\eta_{1p}|^r < \infty, \quad \text{where } r \ge 3, \qquad (12)$$
$$\sigma_\varepsilon^2 = \mathrm{Var}(\varepsilon) > 0 \quad \text{and} \quad B = E(\eta_1 \eta_1^T) \text{ is a positive definite matrix}, \qquad (13)$$
and
$$\eta_i \text{ is independent of } \varepsilon_i \quad (i = 1, \dots, n). \qquad (14)$$
Observe that assumptions (11)-(14) are not unduly restrictive, and they are
quite usual in the setting of nonfunctional partial linear models (see [RB90],
[G95], and [L00], among others).
3 Asymptotic behaviour
We are now in a position to give our asymptotic results. Theorem 1 studies
the asymptotic behaviour of the estimate of the parametric component of the
model, while Theorem 2 concerns the nonparametric one.
Theorem 1. Under assumptions (3)-(5) and (8)-(14), if in addition $n h^{4\alpha} \to 0$ as $n \to \infty$ and $\phi(h) \ge n^{(2/r)+b-1}/(\log n)^2$ for $n$ large enough and some constant $b > 0$ satisfying $(2/r) + b > 1/2$ (where $r \ge 3$ was defined in assumption (12)), then
(i) $\sqrt{n}\,(\hat\beta_h - \beta) \xrightarrow{d} N(0, \sigma_\varepsilon^2 B^{-1})$,
(ii) $\limsup_{n\to\infty} (n/(2\log\log n))^{1/2}\, |\hat\beta_{hj} - \beta_j| = (\sigma_\varepsilon^2 b^{jj})^{1/2}$ a.s.,
where $b^{jj} = (B^{-1})_{jj}$.
This theorem extends previous results established in the nonfunctional setting (see [C88], and [G95], among others) to the case where the explanatory
variable T is of functional nature. We see that the dimension of T does not
change the rate of convergence of βbh , but it modifies the conditions on the
smoothing parameter h.
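Theorem 1(i) suggests approximate confidence intervals for the $\beta_j$ once $\sigma_\varepsilon^2$ and $B$ are replaced by estimates. The Python sketch below uses plug-in estimates, namely the mean squared residual of the smoothed regression for $\sigma_\varepsilon^2$ and $\tilde X_h^T \tilde X_h / n$ for $B$. These plug-ins, like the function name, are assumptions of this example and are not discussed in the text, so the code should be read as an illustration rather than as a procedure justified by the theorem.

```python
import numpy as np
from statistics import NormalDist


def beta_confidence_intervals(X_tilde, Y_tilde, level=0.95):
    """Normal-approximation intervals for beta_j motivated by Theorem 1(i).

    X_tilde and Y_tilde are the smoothed quantities (I - W_h) X and (I - W_h) Y.
    """
    n = X_tilde.shape[0]
    beta_hat, *_ = np.linalg.lstsq(X_tilde, Y_tilde, rcond=None)
    resid = Y_tilde - X_tilde @ beta_hat
    sigma2_hat = resid @ resid / n               # assumed plug-in for sigma_eps^2
    B_hat = X_tilde.T @ X_tilde / n              # assumed plug-in for B = E(eta eta^T)
    b_diag = np.diag(np.linalg.inv(B_hat))       # b^{jj} = (B^{-1})_{jj}
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    half_width = z * np.sqrt(sigma2_hat * b_diag / n)
    return beta_hat - half_width, beta_hat + half_width
```

The half-width shrinks at the rate $n^{-1/2}$ stated in the theorem, whatever the dimension of $T$.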
Theorem 2. Under the assumptions of Theorem 1, we have that
$$\sup_{t \in C} |\hat m_h(t) - m(t)| = O(h^{\alpha}) + O\!\left(\sqrt{\log n/(n\phi(h))}\right) \quad \text{a.s.}$$
This result can be seen as an extension, in several directions, of the existing literature. Firstly, observe that it extends the results available for pure
nonparametric functional models (see [FV06], and references therein). The
rates of convergence are similar, showing that (as was previously the case
in nonfunctional partial linear models) the existence of a linear component
does not change the rates of convergence of the nonparametric component.
At this point it is worth noting that, even if the main novelty of our methodology is to consider functional situations (that is, situations where the space
$H$ is of infinite dimension), all our results apply directly to the special case
where $H = \mathbb{R}^q$. To fix ideas, let us just mention that if $T$ takes values in
$C \subset \mathbb{R}^q$ and if $T$ has a strictly positive density (on its support) with respect
to Lebesgue measure, then we can take $\phi(h) \sim h^q$. This means
that, if we particularize our results to the finite-dimensional case, the rate of
convergence of the nonparametric component reduces to the classical one,
of order $\sqrt{\log n/(n h^q)}$.
4 A real example
In this section, we present an application of the SFPLR model to spectrometric
curves. The aim of this application is not to achieve a full case study but
to show the interest of the three ideas in our model, that is: (i) functional
nonparametric part, (ii) additional information with real explanatory variables
and (iii) linearity of the effect of the real explanatory variables.
Each food sample contains finely chopped pure meat with different fat,
protein and moisture (water) contents. For each food sample, the functional
data consist of a 100-channel spectrum of absorbances recorded on a Tecator Infratec Food and Feed Analyzer working in the wavelength range 850–1050 nm by the Near Infrared Transmission (NIT) principle. The fat, protein and moisture contents, measured in percent, are determined by analytic
chemistry. The aim is to find the relationship between the percentage of fat
content $Y$, and the corresponding percentages of protein content $X_1$ and moisture content $X_2$, and the spectrometric curve $T$. More details on the data
can be found in [FV06]. In total, we had $n = 215$ independent observations
$\{(Y_i, X_{i1}, X_{i2}, T_i)\}_{i=1}^{n}$ of $(Y, X_1, X_2, T)$. This sample was divided into two data
sets: the training sample $I = \{1, \dots, 165\}$ was used to select some parameters
of the estimates and the testing sample $J = \{166, \dots, 215\}$ was used to assess
the quality of prediction.
We have used various models to predict the fat content of a meat
sample on the basis of its protein and/or moisture contents and/or its NIT
absorbance spectrum. Concerning the functional features of the models, and
according to [FV06], we used the semi-metrics $d_s(\cdot, \cdot) = \|\cdot - \cdot\|_s$, $s = 0, 1, 2, 3$, where
$$\|f\|_s = \left( \int \left( f^{(s)}(t) \right)^2 dt \right)^{1/2},$$
and we used k-nearest neighbours type
bandwidths. Both parameters $s$ and $k$ were selected by cross-validation over
the training sample $I$. Furthermore, for the linear and additive features of the
models (non- and semi-functional), OLS estimators and a backfitting algorithm (see [HT90]) were used, respectively.
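To make the semi-metrics $d_s$ and the k-nearest neighbours bandwidths concrete, here is a minimal Python sketch for curves observed on a common grid. Several points are assumptions of this example and are not taken from the text: derivatives are approximated by repeated finite differences (the text does not say how they were computed), the integral is approximated by the trapezoidal rule, and the local bandwidth at a point is taken to be the distance to its k-th nearest sample curve. The function names are hypothetical.

```python
import numpy as np


def derivative_semimetric(curves_a, curves_b, grid, s=2):
    """Approximate d_s between each curve in curves_a and each curve in curves_b.

    curves_a has shape (n_a, G) and curves_b has shape (n_b, G); both are sampled
    on `grid` (length G). The result has shape (n_a, n_b).
    """
    grid = np.asarray(grid, dtype=float)
    da = np.asarray(curves_a, dtype=float)
    db = np.asarray(curves_b, dtype=float)
    for _ in range(s):                       # s-th derivative by repeated differencing
        da = np.gradient(da, grid, axis=1)
        db = np.gradient(db, grid, axis=1)
    sq = (da[:, None, :] - db[None, :, :]) ** 2
    # Trapezoidal rule for the integral of the squared difference over the grid.
    integral = np.sum(0.5 * (sq[..., 1:] + sq[..., :-1]) * np.diff(grid), axis=-1)
    return np.sqrt(integral)


def knn_bandwidths(dist, k):
    """Local bandwidth for each row: distance to its k-th nearest column (assumed convention)."""
    return np.sort(dist, axis=1)[:, k - 1]
```

For $s \ge 1$ two curves that differ by a constant are at distance zero, which is why $d_s$ is a semi-metric rather than a metric. When `dist` is computed within the training sample, each curve counts as its own nearest neighbour under this convention.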
The criterion used on the test sample $J$ to compare the performance of the different
models was the following mean quadratic error of prediction:
$$\frac{1}{\mathrm{Card}\, J} \sum_{j \in J} (Y_j - \hat Y_j)^2 / \mathrm{Var}_J(Y). \qquad (15)$$
The different models used and the corresponding values of this criterion
are shown in Table 1 below.
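Criterion (15) can be computed in a few lines. The Python sketch below is only illustrative: the function name is hypothetical, and the empirical variance over the test sample is computed with divisor $\mathrm{Card}\, J$, which is an assumption since the text does not specify the normalisation of $\mathrm{Var}_J(Y)$.

```python
import numpy as np


def relative_prediction_error(y_true, y_pred):
    """Mean squared prediction error over the test sample divided by Var_J(Y)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)
    return mse / np.var(y_true)   # np.var divides by Card J (ddof=0), an assumed convention
```

With this normalisation, predicting every response by the test-sample mean gives a value of 1, so values close to 0 indicate that a model explains most of the variability of the fat content.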
The first interesting thing to note is the strong linear relationship between
the fat content and the protein and moisture contents: the corresponding linear
model gives similar information (in terms of the mean error of prediction, 0.0111)
to that of the SFPLR model containing the moisture content and the spectrometric curve (0.0114). More interestingly, if one combines these two models (by means of
an SFPLR model with protein and moisture as real explanatory variables and
the spectrometric curve as functional one), then the mean error of prediction is
reduced by about 50% (to 0.0052). The other models studied are clearly worse than the
three models just mentioned.
In conclusion, this spectrometric application has
shown the interest of the three points of the model (see (i)-(iii) at the start of
this section), these data being characterized by their functional nonparametric
structure and the linear effect of exogenous variables. The SFPLR model is a
competitive one for such data.
Table 1. Models and mean value of the criterion error for the test sample

Model                                              Optimal        Mean error
                                                   semi-metric    of prediction
Linear models
  Y = α1 + X1 β1 + ε1                              -              0.2296
  Y = α2 + X2 β2 + ε2                              -              0.0232
  Y = α3 + X1 β3,1 + X2 β3,2 + ε3                  -              0.0111
Nonparametric models
  Y = m1 (X1) + ε4                                 -              0.3761
  Y = m2 (X2) + ε5                                 -              0.0256
Additive model
  Y = µ + m3 (X1) + m4 (X2) + ε6                   -              0.0317
Functional model
  Y = m5 (T) + ε7                                  2              0.0233
Semi-Functional Partial Linear models
  Y = X1 β4 + m6 (T) + ε8                          2              0.0223
  Y = X2 β5 + m7 (T) + ε9                          1              0.0114
  Y = X1 β6,1 + X2 β6,2 + m8 (T) + ε10             1              0.0052
Additive Semi-Functional models
  Y = µ + m9 (X1) + m10 (T) + ε11                  2              0.0242
  Y = µ + m11 (X2) + m12 (T) + ε12                 2              0.0368
  Y = µ + m13 (X1) + m14 (X2) + m15 (T) + ε13      2              0.0395

Acknowledgements. Research of the first author was supported in part
by MEC Grant (EU ERDF support included) MTM2005-00429. Philippe Vieu
wishes to thank all the participants of the working group STAPH on Functional Statistics at the University Paul Sabatier of Toulouse for stimulating
and continuous helpful comments. The activities of this group are available
on http://www.lsp.ups-tlse.fr/Fp/Ferraty/staph.html.
References
[AGV04] Aneiros-Pérez, G., González-Manteiga, W., Vieu, P.: Estimation and testing in a partial linear regression model under long-memory dependence. Bernoulli, 10, 49–78 (2004)
[BZ97] Bhattacharya, P.K., Zhao, P-L.: Semiparametric inference in a partial linear model. Ann. Statist., 25, 244–262 (1997)
[CFS03] Cardot, H., Ferraty, F., Sarda, P.: Spline estimators for the functional linear model. Statist. Sinica, 13, 571–591 (2003)
[C88] Chen, H.: Convergence rates for parametric components in a partly linear model. Ann. Statist., 16, 136–146 (1988)
[EGRW86] Engle, R., Granger, C., Rice, J., Weiss, A.: Nonparametric estimates of the relation between weather and electricity sales. J. Amer. Statist. Assoc., 81, 310–320 (1986)
[FV06] Ferraty, F., Vieu, P.: Nonparametric functional data analysis. Springer, New York (in print)
[G95] Gao, J.T.: The laws of the iterated logarithm of some estimates in partly linear models. Statist. Probab. Lett., 25, 153–162 (1995)
[HLG00] Härdle, W., Liang, H., Gao, J.: Partially linear models. Physica-Verlag (2000)
[HT90] Hastie, T.J., Tibshirani, R.J.: Generalized Additive Models. Chapman & Hall, New York (1990)
[L00] Liang, H.: Asymptotic normality of parametric part in partially linear models with measurement error in the nonparametric part. J. Statist. Plann. Inference, 86, 51–62 (2000)
[RS02] Ramsay, J., Silverman, B.: Applied functional data analysis. Methods and case studies. Springer-Verlag (2002)
[RS05] Ramsay, J., Silverman, B.: Functional data analysis. Springer-Verlag (2005)
[RB90] Ritov, Y., Bickel, P.J.: Achieving information bounds in non and semiparametric models. Ann. Statist., 18, 925–938 (1990)
[S96] Schick, A.: Root-n consistent estimation in partly linear regression models. Statist. Probab. Lett., 28, 353–358 (1996)
[S88] Speckman, P.: Kernel smoothing in partial linear models. J. Roy. Statist. Soc. Ser. B, 50, 413–436 (1988)