Testing index sufficiency with a predicted index
Andreas Dzemski∗
November 20, 2015

This paper tackles the problem of testing the null hypothesis of index sufficiency
$H_0 : E[Y \mid X] = E[Y \mid r_0(X)]$
when the index rule $r_0$ is unknown and has to be estimated at a first stage.
I extend a testing approach by Delgado and González Manteiga (2001) to allow for
predicted variables in the conditioning set of the null model. The class of
permissible estimators of $r_0$ is characterized in terms of assumptions about
their precision and complexity, and comprises a wide range of parametric,
semiparametric and fully nonparametric estimators. I provide a stochastic
expansion that describes how the estimation of the index rule affects the
asymptotic distribution of the test statistic. This expansion holds uniformly
over the class of permissible first-stage estimators. As demonstrated for
kernel-based estimators, the first-stage estimation typically affects the asymptotic
distribution of the test statistic. I suggest a multiplier bootstrap procedure
which explicitly accounts for the first-stage estimation error. A rejection rule
based on bootstrapped critical values guarantees that the test is correctly sized.
In contrast to the case of an observed index, employing a higher-order kernel
is not sufficient to eliminate the bias from kernel smoothing. An alternative
procedure that uses an estimator of the bias to center the test statistic works
under relatively weak assumptions about the first-stage estimator.
JEL codes: C12, C14, C52
Keywords: significance test, generated regressors, U-statistic, multiplier
bootstrap
∗ School of Business, Economics and Law, University of Gothenburg.
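
To make the bootstrap rejection rule mentioned in the abstract more concrete, the following is a minimal Python sketch of a generic wild-bootstrap significance test with an observed index. It is not the procedure proposed in the paper: it treats the index as known and therefore contains no correction for first-stage estimation error, which is precisely what the paper adds. The function names (`kernel_matrix`, `cvm_statistic`, `wild_bootstrap_test`), the Gaussian kernel, the Rademacher multipliers, the bandwidth `h=0.3`, and the simulated design are all assumptions made for illustration.

```python
import numpy as np


def kernel_matrix(w, h):
    """Gaussian kernel weights K((w_i - w_j) / h) / h for a scalar index w."""
    u = (w[:, None] - w[None, :]) / h
    return np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)


def cvm_statistic(y, x, k):
    """Cramer-von Mises statistic built from density-weighted residuals."""
    n = len(y)
    # resid[i] = (1/n) * sum_j k[i, j] * (y[i] - y[j])
    resid = (k.sum(axis=1) * y - k @ y) / n
    # below[i, l] = 1 if X_i <= X_l holds in every coordinate
    below = np.all(x[:, None, :] <= x[None, :, :], axis=2).astype(float)
    t = resid @ below / n          # process evaluated at each sample point
    return float(np.sum(t**2))


def wild_bootstrap_test(y, x, w, h, n_boot=199, seed=0):
    """Return (statistic, bootstrap p-value) for an OBSERVED index w."""
    rng = np.random.default_rng(seed)
    k = kernel_matrix(w, h)
    m_hat = (k @ y) / k.sum(axis=1)      # Nadaraya-Watson fit of E[Y | w]
    eps_hat = y - m_hat
    stat = cvm_statistic(y, x, k)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        v = rng.choice([-1.0, 1.0], size=len(y))   # Rademacher multipliers
        boot[b] = cvm_statistic(m_hat + eps_hat * v, x, k)
    return stat, float(np.mean(boot >= stat))


# Simulated data satisfying the null: Y depends on X only through w = X1 + X2.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 2))
w = x.sum(axis=1)
y = np.sin(w) + 0.5 * rng.normal(size=200)
stat, pval = wild_bootstrap_test(y, x, w, h=0.3)
print(f"statistic = {stat:.5f}, bootstrap p-value = {pval:.3f}")
```

The null would be rejected at level alpha when the bootstrap p-value falls below alpha. With a predicted index, replacing `w` by a first-stage estimate and reusing this sketch unchanged would ignore the estimation effect that the abstract describes, so it should be read only as a baseline for the observed-index case.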
References
Aït-Sahalia, Yacine, Peter J Bickel, and Thomas M Stoker (2001). “Goodness-of-fit tests
for kernel regression with an application to option implied volatilities”. In: Journal
of Econometrics 105.2, pp. 363–412.
Carroll, Raymond J et al. (1997). “Generalized partially linear single-index models”. In:
Journal of the American Statistical Association 92.438, pp. 477–489.
Chen, Song Xi and Ingrid Van Keilegom (2009). “A goodness-of-fit test for parametric and
semi-parametric models in multiresponse regression”. In: Bernoulli 15.4, pp. 955–976.
Das, Mitali, Whitney K Newey, and Francis Vella (2003). “Nonparametric estimation of
sample selection models”. In: The Review of Economic Studies 70.1, pp. 33–58.
Delgado, Miguel A and Wenceslao González Manteiga (2001). “Significance testing in
nonparametric regression based on the bootstrap”. In: The Annals of Statistics,
pp. 1469–1507.
Dzemski, Andreas and Florian Sarnetzki (2014). “Overidentification test in a nonparametric treatment model with unobserved heterogeneity”. Working Paper.
Escanciano, Juan Carlos, David Jacho-Chávez, and Arthur Lewbel (2014). “Uniform
convergence of weighted sums of non and semiparametric residuals for estimation
and testing”. In: Journal of Econometrics 178, pp. 426–443.
Escanciano, Juan Carlos and Kyungchul Song (2010). “Testing single-index restrictions
with a focus on average derivatives”. In: Journal of Econometrics 156.2, pp. 377–391.
Fan, Yanqin and Qi Li (1996). “Consistent model specification tests: omitted variables
and semiparametric functional forms”. In: Econometrica: Journal of the Econometric
Society, pp. 865–890.
Hansen, Bruce (2008). “Uniform convergence rates for kernel estimation with dependent
data”. In: Econometric Theory 24.03, pp. 726–748.
Härdle, Wolfgang and James Marron (1985). “Optimal bandwidth selection in
nonparametric regression function estimation”. In: The Annals of Statistics,
pp. 1465–1481.
Hastie, Trevor and Robert Tibshirani (1986). “Generalized additive models”. In: Statistical
Science, pp. 297–310.
Heckman, James (1979). “Sample selection bias as a specification error”. In: Econometrica,
pp. 153–161.
Heckman, James J and Edward Vytlacil (2005). “Structural Equations, Treatment Effects,
and Econometric Policy Evaluation”. In: Econometrica 73.3, pp. 669–738.
Heckman, James, Hidehiko Ichimura, Jeffrey Smith, et al. (1998). “Characterizing Selection Bias Using Experimental Data”. In: Econometrica, pp. 1017–1098.
Heckman, James, Hidehiko Ichimura, and Petra Todd (1998). “Matching as an econometric
evaluation estimator”. In: The Review of Economic Studies 65.2, pp. 261–294.
Ichimura, Hidehiko (1993). “Semiparametric least squares (SLS) and weighted SLS
estimation of single-index models”. In: Journal of Econometrics 58.1, pp. 71–120.
Jones, Chris, James Marron, and Simon Sheather (1996). “A brief survey of bandwidth
selection for density estimation”. In: Journal of the American Statistical Association
91.433, pp. 401–407.
Klein, Roger W and Richard H Spady (1993). “An efficient semiparametric estimator for
binary response models”. In: Econometrica, pp. 387–421.
Lavergne, Pascal (2001). “An equality test across nonparametric regressions”. In: Journal
of Econometrics 103.1, pp. 307–344.
Lavergne, Pascal and Quang Vuong (2000). “Nonparametric significance testing”. In:
Econometric Theory 16.04, pp. 576–601.
Lavergne, Pascal, Samuel Maistre, Valentin Patilea, et al. (2015). “A significance test
for covariates in nonparametric regression”. In: Electronic Journal of Statistics 9,
pp. 643–678.
Li, Qi (1999). “Consistent model specification tests for time series econometric models”.
In: Journal of Econometrics 92.1, pp. 101–147.
Li, Qi, Cheng Hsiao, and Joel Zinn (2003). “Consistent specification tests for semiparametric/nonparametric models based on series estimation methods”. In: Journal of
Econometrics 112.2, pp. 295–325.
Maistre, Samuel and Valentin Patilea (2014). “Nonparametric model checks for
single-index assumptions”. Working Paper.
Mammen, Enno, Christoph Rothe, and Melanie Schienle (2012). “Nonparametric regression with nonparametrically generated covariates”. In: The Annals of Statistics 40.2,
pp. 1132–1170.
Mammen, Enno, Christoph Rothe, and Melanie Schienle (2015). “Semiparametric
estimation with generated covariates”. In: Econometric Theory.
Masry, Elias (1996). “Multivariate local polynomial regression for time series: uniform
strong consistency and rates”. In: Journal of Time Series Analysis 17.6, pp. 571–599.
Newey, Whitney K (2009). “Two-step series estimation of sample selection models”. In:
The Econometrics Journal 12.s1, S217–S229.
Nolan, Deborah and David Pollard (1987). “U-processes: Rates of Convergence”. In: The
Annals of Statistics, pp. 780–799.
Pollard, David (1984). Convergence of stochastic processes. Springer.
Rodríguez-Póo, Juan M, Stefan Sperlich, and Philippe Vieu (2015). “Specification testing
when the null is nonparametric or semiparametric”. In: Econometric Theory,
pp. 1–29.
Rosenbaum, Paul R and Donald B Rubin (1983). “The central role of the propensity
score in observational studies for causal effects”. In: Biometrika 70.1, pp. 41–55.
Sherman, Robert P (1994). “Maximal inequalities for degenerate U-processes with
applications to optimization estimators”. In: The Annals of Statistics, pp. 439–459.
Stute, Winfried (1997). “Nonparametric model checks for regression”. In: The Annals of
Statistics, pp. 613–641.
Stute, Winfried and Li-Xing Zhu (2005). “Nonparametric checks for single-index models”.
In: The Annals of Statistics, pp. 1048–1083.
van de Geer, Sara (2000). Empirical Processes in M-estimation. Vol. 6. Cambridge
University Press.
van der Vaart, Aad and Jon Wellner (1996). Weak Convergence and Empirical Processes.
Springer.
Vytlacil, Edward (2002). “Independence, monotonicity, and latent index models: An
equivalence result”. In: Econometrica 70.1, pp. 331–341.
Xia, Yingcun and Wolfgang Härdle (2006). “Semi-parametric estimation of partially
linear single-index models”. In: Journal of Multivariate Analysis 97.5, pp. 1162–1184.
Xia, Yingcun, WK Li, et al. (2004). “A goodness-of-fit test for single-index models”. In:
Statistica Sinica 14.1, pp. 1–28.