Sparse Models for Dependent Data

The goal of this mini-course is to introduce sparse models and their applications to dependent data. Sparse modeling is a research area linking statistics, machine learning, and signal processing, motivated by the old and important statistical problem of variable selection in high-dimensional datasets. Selecting a small set of highly predictive variables is central to many applications in which the ultimate objective is to enhance our understanding of the underlying data-generating process. More recently, sparse modeling has also become popular in econometrics, where, beyond pure prediction, the identification of causal relations is of paramount importance. In recent years a vast number of models and algorithms have been proposed, mainly based on ℓ1-regularized optimization (the canonical objective is recalled below). Examples include sparse regression, such as the LASSO and its various extensions (Elastic Net, fused LASSO, group LASSO, simultaneous/multi-task LASSO, adaptive LASSO, etc.), sparse graphical model selection, sparse dimensionality reduction (sparse PCA, CCA, NMF, etc.), and the learning of dictionaries that allow sparse representations. Applications of these methods are wide-ranging, including economics, finance, marketing, computational biology, neuroscience, and image processing.
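To fix ideas, the LASSO of [15] is the ℓ1-penalized least-squares estimator

  \hat{\beta}_{\mathrm{lasso}} = \arg\min_{\beta \in \mathbb{R}^p} \left\{ \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^2 + \lambda \lVert \beta \rVert_1 \right\}, \qquad \lambda \ge 0,

where (y_i, x_i), i = 1, ..., n, are the observations, \beta \in \mathbb{R}^p is the coefficient vector, and the penalty level \lambda governs how many coefficients are set exactly to zero. The extensions covered in the course modify this criterion: the Elastic Net [11] adds a ridge term \lambda_2 \lVert \beta \rVert_2^2, the group LASSO [23] penalizes groups of coefficients jointly, and the adaptive LASSO [25] reweights the ℓ1 penalty coefficient by coefficient.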
The course will be organized as follows:

Lecture 1: May 14
- Introduction to learning methods and econometrics for dependent data: parametric versus nonparametric. References: [10], [21]
- Sparse dimensionality reduction (factor models, PCA, and sparse PCA). References: [10], [21]
- Sparse regression models and algorithms (Elastic Net, fused LASSO, group LASSO, simultaneous/multi-task LASSO, adaptive LASSO, etc.). References: [1], [2], [7], [8], [10], [11], [13], [14], [15], [16], [18], [21], [22], [23], [24], [25]

Lecture 2: May 15
- Applications of sparse modeling to time-series data. References: [9], [12], [17], [19], [20]
- Sparse algorithms for instrumental variables and generalized method of moments estimation. References: [3], [4]

References:

[1] Belloni, A. and V. Chernozhukov (forthcoming). Least squares after model selection in high-dimensional sparse models. Bernoulli.
[2] Belloni, A. and V. Chernozhukov (2011). L1-penalized quantile regression in high-dimensional sparse models. The Annals of Statistics, 39: 82–130.
[3] Belloni, A., V. Chernozhukov, and C. Hansen (2011). Lasso methods for Gaussian instrumental variables models. Working paper.
[4] Caner, M. (2009). Lasso-type GMM estimator. Econometric Theory, 25(1): 270–290.
[5] Caner, M. and K. Knight (2008). No country for old unit root tests: bridge estimators differentiate between nonstationary versus stationary models and select optimal lag. Working paper, University of Toronto.
[6] Fan, J. and R. Li (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96: 1348–1360.
[7] Efron, B., T. Hastie, I. Johnstone, and R. Tibshirani (2004). Least angle regression. The Annals of Statistics, 32: 407–499.
[8] Fan, J. and H. Peng (2004). Nonconcave penalized likelihood with a diverging number of parameters. The Annals of Statistics, 32(3): 928–961.
[9] Gelper, S. and C. Croux (2009). Time series least angle regression for selecting predictive economic sentiment series. Working paper, University of Rotterdam (www.econ.kuleuven.be/sarah.gelper/public).
[10] Hastie, T., R. Tibshirani, and J. Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
[11] Zou, H. and T. Hastie (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 67: 301–320.
[12] Hsu, N., H. Hung, and Y. Chang (2008). Subset selection for vector autoregressive processes using lasso. Computational Statistics & Data Analysis, 52(7): 3645–3657.
[13] Huang, J., S. Ma, and C.-H. Zhang (2008). Adaptive LASSO for sparse high-dimensional regression models. Statistica Sinica, 18: 1603–1618.
[14] Huang, J., J. Horowitz, and S. Ma (2008). Asymptotic properties of bridge estimators in sparse high-dimensional regression models. The Annals of Statistics, 36(2): 587–613.
[15] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58: 267–288.
[16] Knight, K. and W. Fu (2000). Asymptotics for lasso-type estimators. The Annals of Statistics, 28(5): 1356–1378.
[17] Liao, Z. and P. Phillips (2010). Automated estimation of vector error correction models. Work in progress.
[18] Meinshausen, N. and B. Yu (2009). Lasso-type recovery of sparse representations for high-dimensional data. The Annals of Statistics, 37: 246–270.
[19] Nardi, Y. and A. Rinaldo (2011). Autoregressive process modeling via the lasso procedure. Journal of Multivariate Analysis, 102: 528–549.
[20] Song, S. and P. J. Bickel (2011). Large vector autoregressions. arXiv e-prints.
[21] Bühlmann, P. and S. van de Geer (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer Series in Statistics. Springer.
[22] Wang, H., G. Li, and C. Tsai (2007). Regression coefficient and autoregressive order shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 69(1): 63–78.
[23] Yuan, M. and Y. Lin (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 68: 49–67.
[24] Zhao, P. and B. Yu (2006). On model selection consistency of Lasso. Journal of Machine Learning Research, 7: 2541–2563.
[25] Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101: 1418–1429.