Bandwidth selectors performance through SiZer Map

Martínez-Miranda, M.D.^1, Raya-Miranda, R.^1, González-Manteiga, W.^2 and González-Carmona, A.^1
^1 Department of Statistics and R.O., University of Granada, Spain
mmiranda@ugr.es, rraya@ugr.es, andresgc@ugr.es
^2 Department of Statistics and R.O., University of Santiago de Compostela, Spain
wenceslao@usc.es
Summary. In this paper, we extend the graphical tool SiZer Map to additive models estimated by backfitting, marginal integration and efficient mixed methods, in order to evaluate the behaviour of several bandwidth selectors. The strategy consists of visualizing their position inside the colour space defined by the maps. A simulation study considering different bivariate additive regression models has been carried out; it illustrates and describes a convenient use of SiZer to provide meaningful comparisons among bandwidth selectors.
Key words: Additive Model, Backfitting, Marginal Integration, SiZer Map,
Smoothing Parameter
1 Introduction
Additive models in nonparametric regression ([PE01]) are formulated by considering the following regression model:
$$ m(x) = \alpha + \sum_{d=1}^{D} m_d(x_d), \tag{1} $$
where $\alpha$ is a constant term, $x = (x_1, \dots, x_D)^T$ is the $D$-dimensional vector of predictor variables and $\{m_d,\ d = 1, \dots, D\}$ denotes a set of unknown univariate smooth functions with $E[m_d(X_d)] = 0$.
Different methods have been proposed to estimate additive models; prominent among them are the backfitting algorithm ([BHT89]) and the marginal integration method ([LN95]). The paper [SLH99] gives a comparative study of both methods, showing that, in general, neither can be considered definitively superior to the other. Other efficient estimation methods for additive models combine backfitting and marginal integration. In this sense, [L97] and [KLH99] propose two-step estimators, i.e., estimators that use starting estimates provided by marginal integration within the backfitting algorithm.
SiZer Map is a graphical tool for exploring the features of a data set that are relevant to the curve estimation problem. It can also be used to evaluate the behaviour of several bandwidth selectors, by visualizing their position inside the colour space defined by the map.
2 Notation and preliminary definitions
Let us assume that
$$ Y_i = \alpha + \sum_{d=1}^{D} m_d(X_{di}) + \varepsilon_i, \qquad i = 1, \dots, n, \tag{2} $$
where $\{(X_1, Y_1), \dots, (X_n, Y_n)\}$, with $X_i = (X_{1i}, \dots, X_{Di})^T$, is a set of independent and identically distributed random variables. The residuals, $\varepsilon_i$, are independent and identically distributed with mean 0 and variance $\sigma^2(X_i)$.
Under the additive regression model (2), let $S_d$ denote the $n \times n$ smoother matrix with respect to the $d$-th covariate vector. The nonparametric estimators of the component functions, $m_d$, can be obtained by solving the system of normal equations
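$$
\begin{bmatrix}
I & S_1 & \cdots & S_1 \\
S_2 & I & \cdots & S_2 \\
\vdots & \vdots & \ddots & \vdots \\
S_D & S_D & \cdots & I
\end{bmatrix}
\begin{bmatrix} m_1 \\ m_2 \\ \vdots \\ m_D \end{bmatrix}
=
\begin{bmatrix} S_1 \\ S_2 \\ \vdots \\ S_D \end{bmatrix} Y. \tag{3}
$$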
The backfitting algorithm ([BHT89]) provides an iterative solution of (3). [O00] derived explicit expressions for the estimators when the matrices $S_d$ are based on linear smoothers; they can be written as $\hat m_d = W_d Y$, $d = 1, \dots, D$, with $W_d$ being a matrix of weights (see [O00] for more details).
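The iterative solution of (3) can be illustrated with a minimal sketch. It assumes a Nadaraya-Watson smoother with a Gaussian kernel and a fixed number of sweeps, which are illustrative choices rather than the setting of [BHT89] or [O00]; the function names `gauss`, `nw_smooth` and `backfit` are hypothetical.

```python
import math


def gauss(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_smooth(x, r, h):
    """Nadaraya-Watson fit of responses r on covariate x, evaluated at the sample points."""
    fitted = []
    for x0 in x:
        w = [gauss((xi - x0) / h) for xi in x]
        fitted.append(sum(wi * ri for wi, ri in zip(w, r)) / sum(w))
    return fitted


def backfit(X, y, h, n_iter=20):
    """Backfitting sketch for an additive model.

    X: list of D covariate columns (each of length n); y: responses;
    h: list of D bandwidths (assumed given).
    Returns (alpha, m) where m[d][i] estimates m_d(X_di).
    """
    n, D = len(y), len(X)
    alpha = sum(y) / n  # the constant term is estimated by the mean response
    m = [[0.0] * n for _ in range(D)]
    for _ in range(n_iter):
        for d in range(D):
            # Partial residuals: remove the constant and all other components.
            partial = [y[i] - alpha - sum(m[j][i] for j in range(D) if j != d)
                       for i in range(n)]
            fit = nw_smooth(X[d], partial, h[d])
            mean_fit = sum(fit) / n
            # Centre the fit so that each component has empirical mean zero.
            m[d] = [f - mean_fit for f in fit]
    return alpha, m
```

In practice the loop would stop on a convergence criterion rather than after a fixed number of sweeps; the fixed count only keeps the sketch short.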
From a different perspective, the marginal integration method estimates each component $m_d(\cdot)$ by integrating a pilot multivariate smoother of $m$ with respect to a $(D-1)$-dimensional probability measure. The so-called empirical marginal integration estimator is based on the $D$-dimensional Nadaraya-Watson estimator, $\tilde m(x)$. Briefly, a partition $X_i \equiv (X_{di}, X_{-d,i})$ is defined (here $X_{-d,i}$ denotes the $(D-1)$-dimensional vector obtained by removing $X_{di}$ from $X_i$), and then the empirical marginal integration estimator of the $d$-th component, $m_d(\cdot)$, is computed by
$$ \hat\gamma_d(x_d) = n^{-1} \sum_{j=1}^{n} \tilde m(x_d, X_{-d,j}). $$
Then, the additive reconstruction is
$$ \hat m_{IM}(x) = \sum_{d=1}^{D} \hat\gamma_d(x_d). $$
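A minimal sketch of the empirical marginal integration estimator follows. It assumes a product Gaussian kernel for the multivariate Nadaraya-Watson pilot, which is an illustrative choice; the names `nw_multi` and `marginal_integration` are hypothetical.

```python
import math


def gauss(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_multi(x, X, y, h):
    """D-dimensional Nadaraya-Watson estimate at the point x.

    X is a list of n observation rows, h a list of D bandwidths;
    a product Gaussian kernel is assumed.
    """
    num = den = 0.0
    for Xi, yi in zip(X, y):
        w = 1.0
        for xk, Xik, hk in zip(x, Xi, h):
            w *= gauss((xk - Xik) / hk)
        num += w * yi
        den += w
    return num / den


def marginal_integration(xd, d, X, y, h):
    """Empirical marginal integration estimate of the d-th component at xd.

    Averages the pilot smoother over the observed X_{-d,j}, j = 1, ..., n.
    """
    total = 0.0
    for Xj in X:
        point = list(Xj)
        point[d] = xd  # fix the d-th coordinate, keep the remaining ones
        total += nw_multi(point, X, y, h)
    return total / len(X)
```

Each evaluation costs on the order of n squared kernel products, which is one reason the bandwidth of the pilot smoother is usually kept fixed in this step.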
Recently, [KLH99] proposed an efficient oracle estimator, defined by inserting the previously defined empirical marginal integration estimator into a backfitting algorithm but taking one step only. The method first constructs the responses
$$ Y_{di}^{2\text{-step}} = Y_i - \sum_{j \neq d} \hat\gamma_j(X_{ji}, h_0) \tag{4} $$
with $h_0$ being a pilot scalar bandwidth, and afterwards applies the univariate local polynomial smoother to the pairs $\{(X_{di}, Y_{di}^{2\text{-step}}),\ i = 1, \dots, n\}$ in order to estimate the $d$-th component by
$$ \hat m_d^{2\text{-step}}(x_d) = \sum_{i=1}^{n} w_i^{LP}(x_d, h_d)\, Y_{di}^{2\text{-step}}. \tag{5} $$
Here, $h_d$ ($d = 1, \dots, D$) denotes the scalar bandwidth for each component, and the weights $w_i^{LP}$ are those associated with the univariate local polynomial smoother of degree $p_d$ (more details of the method can be found in [KLH99]).
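Steps (4) and (5) can be sketched as follows, assuming a local linear smoother ($p_d = 1$) with a Gaussian kernel and assuming the pilot values $\hat\gamma_j(X_{ji}, h_0)$ have already been computed and are passed in as `pilot`. The names `local_linear` and `two_step_component` are hypothetical.

```python
import math


def gauss(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_linear(x0, x, r, h):
    """Local linear estimate at x0 from the pairs (x_i, r_i) with bandwidth h."""
    k = [gauss((xi - x0) / h) for xi in x]
    s1 = sum(ki * (xi - x0) for ki, xi in zip(k, x))
    s2 = sum(ki * (xi - x0) ** 2 for ki, xi in zip(k, x))
    # Closed-form local linear weights (up to normalisation).
    w = [ki * (s2 - (xi - x0) * s1) for ki, xi in zip(k, x)]
    return sum(wi * ri for wi, ri in zip(w, r)) / sum(w)


def two_step_component(x0, d, X, y, pilot, h):
    """Two-step estimate of the d-th component at x0.

    X: list of D covariate columns; pilot[j][i]: pilot marginal integration
    value for component j at observation i (assumed precomputed with h0).
    """
    n, D = len(y), len(X)
    # Equation (4): subtract the pilot estimates of all other components.
    y2 = [y[i] - sum(pilot[j][i] for j in range(D) if j != d) for i in range(n)]
    # Equation (5): univariate local linear smoothing of the new responses.
    return local_linear(x0, X[d], y2, h)
```

Because the second step is a univariate smoothing problem, any univariate bandwidth selector can be applied to the pairs built in (4).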
3 SiZer Map
SiZer Map ([CM99]) is a graphical tool first introduced in a univariate context and recently generalized to the two-dimensional case ([GMC02]). SiZer allows us to explore the features of the data set that are relevant to the curve estimation problem. By using different colours (or different tones of grey in black-and-white versions), it shows where the target curve significantly increases or decreases at different smoothing levels. In this paper, we use the SiZer Map to evaluate the behaviour of several bandwidth selectors, by visualizing their position inside the colour space defined by the map.
SiZer is used to determine the significance of the features of a target curve, such as peaks and valleys, by considering a family of smoothers, $\{\hat m(x; h) : h \in [h_{\min}, h_{\max}]\}$. The procedure involves the construction of confidence intervals for the derivative, $m'(x; h)$, over the space defined by the location $x$ and the smoothing parameter $h$.
The extension of SiZer Map to a multidimensional context presents serious difficulties, even with regard to the graphical representation of the maps. One simple solution is to consider additive models, because these allow us to consider the effect of each covariate separately. Following this idea, an immediate extension to an additive model is to construct as many SiZer Maps as there are covariates. Accordingly, [RMG02] develops the expressions necessary to create the SiZer Map for an additive model estimated using the backfitting algorithm. Based on this work, we present the expressions necessary to construct the SiZer Map for an additive model estimated using the efficient method ([KLH99]).
Consider a family of efficient estimators for an additive model such as the one considered in this paper, $\{\hat m(x; h_1, \dots, h_D) : h_d \in [h_{d,\min}, h_{d,\max}],\ d = 1, \dots, D\}$, and define confidence intervals for the derivatives of the component functions, $\hat m'_d(x_d; h_d)$, $d = 1, \dots, D$. The $d$-th map shows the features of the component $m_d$ by means of different colours, in a similar way to that adopted in the univariate context.
The confidence intervals for the derivative of the $d$-th component are written as
$$ \hat m'_d(x_d; h_d) \pm q \sqrt{\widehat{\mathrm{Var}}\big(\hat m'_d(x_d; h_d)\big)}, $$
where the quantile, $q$, is calculated from normal approximations.
The expressions for the derivatives of the components and for the variances of these derivatives are derived next. Let $s^{(r)}_{d,x_d}$, $d = 1, \dots, D$, be a polynomial smoother of the $r$-th derivative at $x_d$, where
$$ s^{(r)T}_{d,x_d} = r!\, e_{r+1}^T \left( X_{x_d}^T W_{x_d} X_{x_d} \right)^{-1} X_{x_d}^T W_{x_d}, $$
with $e_{r+1}$ a $(p_d + 1) \times 1$ vector of zeros with a 1 in the $(r+1)$-th position. Let
$$ S_d^{(r)} = \left[ s^{(r)T}_{d,X_{di}} \right]_{i}, $$
i.e. the matrix whose $i$-th row is $s^{(r)T}_{d,X_{di}}$, $i = 1, \dots, n$.
The vector of local polynomial regression estimates of the derivative function $m_d^{(r)}$ with respect to $x_d$ can be defined as
$$ \hat m_d^{(r)} = S_d^{(r)} \Big( Y - \sum_{j \neq d} \hat\gamma_j(X_j, h_0) \Big) = S_d^{(r)} Y_d^{2\text{-step}}, \tag{6} $$
where $Y_d^{2\text{-step}}$ is defined in (4).
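For the local linear case ($p_d = 1$, $r = 1$) the smoother $s^{(1)T}_{d,x_d}$ has a closed form, because the matrix $X_{x_d}^T W_{x_d} X_{x_d}$ is only $2 \times 2$. The sketch below assumes that case and a Gaussian kernel; `derivative_weights` and `derivative_matrix` are hypothetical names.

```python
import math


def gauss(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def derivative_weights(x0, x, h):
    """Row vector e_2^T (X^T W X)^{-1} X^T W of the local linear fit at x0.

    Applied to a response vector, it returns the local slope, i.e. an
    estimate of the first derivative at x0 (here r = 1, so r! = 1).
    """
    k = [gauss((xi - x0) / h) for xi in x]
    s0 = sum(k)
    s1 = sum(ki * (xi - x0) for ki, xi in zip(k, x))
    s2 = sum(ki * (xi - x0) ** 2 for ki, xi in zip(k, x))
    det = s0 * s2 - s1 * s1  # determinant of the 2 x 2 matrix X^T W X
    return [ki * (s0 * (xi - x0) - s1) / det for ki, xi in zip(k, x)]


def derivative_matrix(x, h):
    """Matrix whose i-th row holds the derivative weights at the sample point x_i."""
    return [derivative_weights(xi, x, h) for xi in x]
```

A quick check on the construction: the weights sum to zero and reproduce the slope of any straight line exactly, whatever the bandwidth.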
Then, the estimator of the first derivative of the component $m_d$ can be written as $m'_d(x_d) = W^{*}_{d,x_d} Y$, where
$$ W^{*}_{d,x_d} = S_d^{(1)} \Big( 1 - \sum_{j \neq d} w_j^{IM}(x_d, h_d) \Big). $$
Therefore
$$ \mathrm{Var}\big(m'_d(x_d)\big) = W^{*T}_{d,x_d}\, \Sigma\, W^{*}_{d,x_d}, $$
where $\Sigma = \mathrm{diag}(\varepsilon_i^2)$.
The residuals, $\varepsilon_i$, are estimated by $\hat\varepsilon_i = Y_i - \hat m(X_i)$, for any pilot estimator $\hat m$. By substituting these estimates in the matrix $\Sigma$, the previous variance can be estimated as follows:
$$ \widehat{\mathrm{Var}}\big(m'_d(x_d)\big) = W^{*T}_{d,x_d}\, \hat\Sigma\, W^{*}_{d,x_d}. $$
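To show how the estimated variance feeds the colour coding of a map, the sketch below works in a simplified univariate setting: a local linear derivative estimate, the quadratic form of its weight vector with $\hat\Sigma = \mathrm{diag}(\hat\varepsilon_i^2)$, and a pointwise normal quantile $q = 1.96$. All of these are assumptions made for illustration (in particular the quantile, which the text only says comes from normal approximations), and `slope_weights`, `sizer_cell` and `sizer_map` are hypothetical names.

```python
import math


def gauss(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def slope_weights(x0, x, h):
    """Local linear weights whose inner product with y estimates the derivative at x0."""
    k = [gauss((xi - x0) / h) for xi in x]
    s0 = sum(k)
    s1 = sum(ki * (xi - x0) for ki, xi in zip(k, x))
    s2 = sum(ki * (xi - x0) ** 2 for ki, xi in zip(k, x))
    det = s0 * s2 - s1 * s1
    return [ki * (s0 * (xi - x0) - s1) / det for ki, xi in zip(k, x)]


def sizer_cell(x0, x, y, resid, h, q=1.96):
    """Classify one (x0, h) cell: +1 increasing, -1 decreasing, 0 not significant."""
    w = slope_weights(x0, x, h)
    deriv = sum(wi * yi for wi, yi in zip(w, y))
    # Quadratic form of the weights with Sigma-hat = diag(resid_i^2).
    var = sum(wi * wi * ei * ei for wi, ei in zip(w, resid))
    half = q * math.sqrt(var)
    if deriv - half > 0:
        return 1
    if deriv + half < 0:
        return -1
    return 0


def sizer_map(grid, bandwidths, x, y, resid):
    """Matrix of cell codes: one row per bandwidth, one column per grid point."""
    return [[sizer_cell(x0, x, y, resid, h) for x0 in grid] for h in bandwidths]
```

A bandwidth selector can then be drawn on top of the resulting matrix as a curve (local selector) or a horizontal line (global selector) in the $(x, \log_{10} h)$ plane.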
4 Evaluating bandwidth selectors using SiZer Map
The choice of the smoothing parameter (or parameters) is a crucial technical problem in nonparametric and semiparametric regression because of its direct effect on the performance of the smoothers. Our objective is to evaluate the behaviour of several bandwidth selectors, including local and global selectors, for the two methods of estimation. The novelty and most important feature of the comparison is that it shows how the bandwidth selectors work across the data. Such a meaningful comparison is made possible by the SiZer tool, which has been extended to the additive nonparametric regression context considered here.
Among the bandwidth selectors for additive models, we have evaluated global bandwidth selectors based on cross-validation, which is available for backfitting, marginal integration and efficient mixed estimators (presented in Section 2), and on plug-in techniques developed for any of them. The global nature of these criteria leads to smoothing parameter values that are constant over the whole estimation interval. Local bandwidth selectors for nonparametric smoothers provide notable improvements in the estimation of surfaces by adapting better to the underlying features of the data. Under additivity, local versions of cross-validation have been proposed previously, as well as bootstrap selectors such as the one introduced by [MRGG05], which performed well with all additive smoothers.
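The idea behind a global cross-validation selector can be sketched in the simplest possible setting: a univariate Nadaraya-Watson smoother and a leave-one-out criterion minimized over a finite grid. This is only an illustration of the principle, not the additive-model selectors of [OR98]; `loo_cv_score` and `select_global_bandwidth` are hypothetical names.

```python
import math


def gauss(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def loo_cv_score(x, y, h):
    """Leave-one-out cross-validation score of a Nadaraya-Watson fit with bandwidth h."""
    n = len(x)
    score = 0.0
    for i in range(n):
        num = den = 0.0
        for j in range(n):
            if j != i:  # the i-th observation is left out of its own prediction
                w = gauss((x[j] - x[i]) / h)
                num += w * y[j]
                den += w
        score += (y[i] - num / den) ** 2
    return score / n


def select_global_bandwidth(x, y, grid):
    """Return the bandwidth in grid that minimizes the leave-one-out score."""
    return min(grid, key=lambda h: loo_cv_score(x, y, h))
```

The selected value is a single number used over the whole estimation interval, which is exactly what makes it appear as a horizontal line in a SiZer map. Very small bandwidths relative to the spacing of the data can make the denominator underflow to zero, so the grid should be chosen accordingly.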
For backfitting smoothers, we have considered two global bandwidth selectors, a plug-in selector and another based on cross-validation (both proposed in [OR98]), and two local selectors, the bootstrap selector proposed by [MRGG05] and a local cross-validation selector (an extension of that proposed by [V91] in a one-dimensional setting), together with the theoretical optimum (the minimizer of the MSE criterion). For the efficient estimator obtained by the mixed method, we have evaluated the local bootstrap selector and the local optimum.
In this spirit, we have carried out a simulation study considering three bivariate additive regression models. To keep the document short, we present only the results for one of these models, which is given by
$$ m_1(x_1) = 1 - 6x_1 + 36x_1^2 - 53x_1^3 + 22x_1^5 $$
and
$$ m_2(x_2) = \sin(5\pi x_2). $$
The explanatory variables were generated from independent normal distributions with mean 0.5 and variance 1/9. The residuals were generated from a normal distribution with mean zero and constant variance 0.25. With these definitions we generated 100 samples of size n = 100.
The local linear smoother involved in the estimations was calculated with a Gaussian kernel, $K(x) = (2\pi)^{-1/2} \exp(-x^2/2)$.
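The simulation design described above can be reproduced with a short sketch. The text does not state the value of the constant term, so the code assumes $\alpha = 0$; the seed handling and the names `m1`, `m2` and `simulate_sample` are also illustrative choices.

```python
import math
import random


def m1(x):
    """First additive component of the simulated model."""
    return 1 - 6 * x + 36 * x ** 2 - 53 * x ** 3 + 22 * x ** 5


def m2(x):
    """Second additive component of the simulated model."""
    return math.sin(5 * math.pi * x)


def simulate_sample(n=100, seed=None):
    """Draw one sample (x1, x2, y) of size n from the bivariate additive model.

    Covariates: independent N(0.5, 1/9), i.e. standard deviation 1/3.
    Errors: N(0, 0.25), i.e. standard deviation 0.5.
    Assumption: the constant term alpha is taken to be 0.
    """
    rng = random.Random(seed)
    x1 = [rng.gauss(0.5, 1.0 / 3.0) for _ in range(n)]
    x2 = [rng.gauss(0.5, 1.0 / 3.0) for _ in range(n)]
    y = [m1(a) + m2(b) + rng.gauss(0.0, 0.5) for a, b in zip(x1, x2)]
    return x1, x2, y


def simulate_study(n_samples=100, n=100, seed=0):
    """Generate the 100 samples of size 100 used in a study of this kind."""
    return [simulate_sample(n, seed + s) for s in range(n_samples)]
```

Note that `random.gauss` takes a standard deviation, not a variance, hence the values 1/3 and 0.5.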
Figures 1 and 2 show the SiZer maps for the backfitting estimators ([OR98]) and for the smoothers derived from the mixed estimation method presented above ([KLH99]).
[Figure 1 panels: a Family Plot and a SiZer Plot for each component; horizontal axes x and z, SiZer Plot vertical axes log10(h1) and log10(h2).]
Fig. 1. SiZer Map for the simulated data with components estimated by Backfitting.
Each figure consists of two types of graphs, the so-called Family Plot and the SiZer Map, both constructed for each component in the model. The Family Plot
[Figure 2 panels: a Family Plot and a SiZer Plot for each component; horizontal axes x and z, SiZer Plot vertical axes log10(h1) and log10(h2).]
Fig. 2. SiZer Map for the simulated data and estimated components using the
Efficient method.
allows us to compare different choices of smoothing level for estimating the components (represented by the blue curves shown), and it also includes the estimates obtained with several specific smoothing parameters. Specifically, the black curve is associated with the local bootstrap selector; the blue one represents the optimal local theoretical bandwidth; the red curve is the estimate with a local cross-validation bandwidth; green shows the global plug-in selector; and yellow is used to plot the estimate with a global cross-validation bandwidth.
SiZer maps, for each additive component, represent the range of the given covariate on the horizontal axis and the smoothing levels on the vertical axis. Each panel also shows the curves associated with the evaluated bandwidths, both local and global: the solid black line represents the plug-in selector and the dashed one shows the global cross-validation bandwidth. The solid white curve is the local bootstrap bandwidth, the white dotted curve is associated with the optimal local theoretical bandwidth, and the white dashed curve represents a local cross-validation bandwidth.
References
[BHT89] Buja, A., Hastie, T.J., Tibshirani, R.: Linear Smoothers and Additive Models (with discussion). Ann. Statist. 17, 453–555 (1989)
[CM99] Chaudhuri, P., Marron, J.S.: SiZer for Exploration of Structures in Curves. J. Amer. Statist. Assoc. 94, 807–823 (1999)
[GMC02] Godtliebsen, F., Marron, J.S., Chaudhuri, P.: Significance in scale space for bivariate density estimation. J. Comput. Graphical Statist. 11, 1–22 (2002)
[GMP04] González-Manteiga, W., Martínez-Miranda, M.D., Pérez-González, A.: The Choice of Smoothing Parameter in Nonparametric Regression through Wild Bootstrap. Comput. Statist. Data Anal. 47, 487–515 (2004)
[HT90] Hastie, T.J., Tibshirani, R.: Generalized additive models. Chapman & Hall (1990)
[KO04] Kauermann, G., Opsomer, J.D.: Generalized cross-validation for bandwidth selection of backfitting estimates in generalized additive models. J. Comput. Graphical Statist. 13, 66–89 (2004)
[KLH99] Kim, W., Linton, O.B., Hengartner, N.W.: A computationally efficient oracle estimator for additive nonparametric regression with bootstrap confidence intervals. J. Comput. Graphical Statist. 8, 278–297 (1999)
[L97] Linton, O.B.: Efficient estimation of additive nonparametric regression models. Biometrika 84, 469–473 (1997)
[LN95] Linton, O.B., Nielsen, J.P.: A kernel method of estimating structured nonparametric regression based on marginal integration. Biometrika 82, 93–100 (1995)
[MRGG05] Martínez-Miranda, M.D., Raya-Miranda, R., González-Manteiga, W., González-Carmona, A.: SiZer Map for evaluating a bootstrap local bandwidth selector in nonparametric additive models. Technical Report 05-01. Universidad de Santiago de Compostela (2005)
[O00] Opsomer, J.D.: Asymptotic properties of backfitting estimators. J. Multivariate Anal. 73, 166–179 (2000)
[OR98] Opsomer, J.D., Ruppert, D.: A fully Automated Bandwidth Selection Method for Fitting Additive Models. J. Amer. Statist. Assoc. 93, 605–619 (1998)
[RMG02] Raya-Miranda, R., Martínez-Miranda, M.D., González-Carmona, A.: Exploring the structure of regression surfaces by using SiZer Map for additive models. Proceedings in Computational Statistics. XV COMPSTAT. Statistics Netherlands (2002)
[SLH99] Sperlich, S., Linton, O.B., Härdle, W.: Integration and Backfitting Methods in Additive Models - Finite Sample Properties and Comparison. Test 8, 419–458 (1999)
[V91] Vieu, P.: Nonparametric regression: Optimal local bandwidth choice. J. Roy. Statist. Soc. 53, 453–464 (1991)