Programme and course description

Sparse Models for Dependent Data
The goal of this mini-course is to introduce sparse models and their applications to dependent data.
Sparse modeling is a research area linking statistics, machine learning and signal processing,
motivated by the old, and important, statistical problem of variable selection in high-dimensional
datasets. Selecting a small set of highly predictive variables is central to many applications where the
ultimate objective is to enhance our understanding of the underlying data-generating process. More
recently, sparse modeling has also become popular in econometrics, where, beyond simple
prediction, the identification of causal relations is of paramount importance.
In recent years a vast number of models and algorithms have been proposed, mainly focused on l1-regularized optimization. Examples include sparse regression, such as LASSO and its various extensions
(Elastic Net, fused LASSO, group LASSO, simultaneous/multi-task LASSO, adaptive LASSO, etc.), sparse
graphical model selection, sparse dimensionality reduction (sparse PCA, CCA, NMF, etc.) and learning
dictionaries that allow sparse representations. Applications of these methods are wide-ranging,
including economics, finance, marketing, computational biology, neuroscience, image processing, etc.
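To fix ideas, here is a minimal, illustrative sketch (not part of the course materials) of how an l1 penalty zeroes out irrelevant coefficients; it assumes Python with numpy and scikit-learn, and the penalty weights are arbitrary choices for illustration:

# Minimal sketch: LASSO and Elastic Net on synthetic sparse data.
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(0)
n, p = 100, 50                            # n observations, p candidate predictors
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, -1.0, 0.5]    # only 5 truly active variables
y = X @ beta + 0.5 * rng.standard_normal(n)

lasso = Lasso(alpha=0.1).fit(X, y)                     # alpha = l1 penalty weight
enet = ElasticNet(alpha=0.1, l1_ratio=0.7).fit(X, y)   # mixes l1 and l2 penalties

# Both fits should typically recover (most of) the true active set {0,...,4}.
print("LASSO selected:      ", np.flatnonzero(lasso.coef_))
print("Elastic Net selected:", np.flatnonzero(enet.coef_))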
The course will be organized as follows:
Lecture 1: May 14

Introduction to learning methods and econometrics for dependent data: parametric versus
nonparametric. References: [10], [21]

Sparse dimensionality reduction (factor models, PCA and sparse PCA); a short illustrative sketch follows this lecture's outline. References: [10], [21]

Sparse regression models and algorithms (Elastic Net, fused LASSO, group LASSO,
simultaneous/multi-task LASSO, adaptive LASSO, etc.). References: [1], [2], [7], [8], [10], [11],
[13], [14], [15], [16], [18], [21], [22], [23], [24], [25]
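As a taste of the sparse dimensionality reduction topic above, the following sketch (an illustration on assumed synthetic data, not course code) contrasts ordinary and sparse PCA loadings via scikit-learn:

# Minimal sketch: PCA versus sparse PCA on data with block factor structure.
import numpy as np
from sklearn.decomposition import PCA, SparsePCA

rng = np.random.default_rng(1)
n, p = 200, 30
factors = rng.standard_normal((n, 3))
loadings = np.zeros((3, p))
loadings[0, :5] = 1.0        # each factor loads on its own block of variables
loadings[1, 5:10] = 1.0
loadings[2, 10:15] = 1.0
X = factors @ loadings + 0.1 * rng.standard_normal((n, p))

pca = PCA(n_components=3).fit(X)
spca = SparsePCA(n_components=3, alpha=1.0, random_state=0).fit(X)  # alpha controls sparsity

# Ordinary PCA loadings are dense; sparse PCA sets most loadings exactly to zero.
print("nonzero loadings, PCA:       ", np.count_nonzero(pca.components_, axis=1))
print("nonzero loadings, sparse PCA:", np.count_nonzero(spca.components_, axis=1))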
Lecture 2: May 15

Applications of sparse modeling to time-series data; a sketch follows the outline. References: [9], [12], [17], [19], [20]

Sparse algorithms for instrumental variables and generalized method of moments estimation.
References: [3], [4].
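In the spirit of the time-series applications above (cf. [12], [19]), the following sketch (an assumed workflow, not the authors' code) selects the active lags of an autoregressive process with an l1 penalty:

# Minimal sketch: lag selection in an AR model via the LASSO.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
T, max_lag = 500, 12
y = np.zeros(T)
for t in range(2, T):                     # simulate an AR(2): only lags 1, 2 active
    y[t] = 0.5 * y[t - 1] - 0.3 * y[t - 2] + rng.standard_normal()

# Row t of the design holds (y_{t-1}, ..., y_{t-max_lag}).
X = np.column_stack([y[max_lag - k:T - k] for k in range(1, max_lag + 1)])
target = y[max_lag:]

fit = Lasso(alpha=0.05).fit(X, target)
print("selected lags:", 1 + np.flatnonzero(fit.coef_))   # typically lags 1 and 2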
References:
[1] Belloni, A. and V. Chernozhukov (forthcoming). Least Squares After Model Selection in High-dimensional Sparse Models. Bernoulli.
[2] Belloni, A. and V. Chernozhukov (2011). L1-Penalized Quantile Regression in High-Dimensional Sparse Models. The Annals of Statistics, 39:82–130.
[3] Belloni, A., V. Chernozhukov, and C. Hansen (2011). Lasso Methods for Gaussian Instrumental Variables Models. Working paper.
[4] Caner, M. (2009). Lasso-type GMM estimator. Econometric Theory, 25(1):270–290.
[5] Caner, M. and K. Knight (2008). No country for old unit root tests: bridge estimators differentiate between nonstationary versus stationary models and select optimal lag. Working Paper, University of Toronto.
[6] Fan, J. and R. Li (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96:1348–1360.
[7] Efron, B., T. Hastie, I. Johnstone and R. Tibshirani (2004). Least Angle Regression. The Annals of Statistics, 32:407–499.
[8] Fan, J. and H. Peng (2004). Nonconcave penalized likelihood with a diverging number of parameters. The Annals of Statistics, 32(3):928–961.
[9] Gelper, S. and C. Croux (2009). Time series least angle regression for selecting predictive economic sentiment series. Working Paper, University of Rotterdam (www.econ.kuleuven.be/sarah.gelper/public).
[10] Hastie, T., R. Tibshirani and J. Friedman (2009). The Elements of Statistical Learning: Data
Mining, Inference, and Prediction. Springer.
[11] Zou, H. and T. Hastie (2005). Regularization and variable selection via the elastic net. Journal
of the Royal Statistical Society, Series B, 67:301–320.
[12] Hsu, N., H. Hung, and Y. Chang (2008). Subset selection for vector autoregressive processes
using lasso. Computational Statistics & Data Analysis, 52(7): 3645–3657.
[13] Huang, J., S. Ma, and C.-H. Zhang (2008). Adaptive LASSO for sparse high-dimensional
regression models. Statistica Sinica, 18:1603–1618.
[14] Huang, J., J. Horowitz, and S. Ma (2008). Asymptotic properties of bridge estimators in sparse
high-dimensional regression models. The Annals of Statistics, 36(2):587–613.
[15] Tibshirani, R. (1996). Regression Shrinkage and Selection via the LASSO. Journal of the Royal
Statistical Society, Series B, 58:267–288.
[16] Knight, K. and W. Fu (2000). Asymptotics for lasso-type estimators. The Annals of Statistics,
28(5):1356–1378.
[17] Liao, Z. and P. Phillips (2010). Automated estimation of vector error correction models. Work
in progress.
[18] Meinshausen, N. and B. Yu (2009). Lasso-type recovery of sparse representations for
high-dimensional data. The Annals of Statistics, 37:246–270.
[19] Nardi, Y. and A. Rinaldo (2011). Autoregressive process modeling via the lasso procedure.
Journal of Multivariate Analysis, 102:528–549.
[20] Song, S. and P. J. Bickel (2011). Large vector autoregressions. ArXiv e-prints.
[21] van de Geer, S. and P. Bühlmann (2011). Statistics for High-Dimensional Data: Methods,
Theory and Applications. Springer Series in Statistics. Springer.
[22] Wang, H., G. Li, and C. Tsai (2007). Regression coefficient and autoregressive order shrinkage
and selection via the lasso. Journal of the Royal Statistical Society, Series B, 69(1):63–78.
[23] Yuan, M. and Y. Lin (2006). Model selection and estimation in regression with grouped
variables. Journal of the Royal Statistical Society, Series B, 68:49–67.
[24] Zhao, P. and B. Yu (2006). On model selection consistency of Lasso. Journal of Machine
Learning Research, 7:2541–2563.
[25] Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical
Association, 101:1418–1429.