Distributed Lasso for In-Network Linear Regression

Juan Andrés Bazerque, Gonzalo Mateos, and Georgios B. Giannakis
March 16, 2010
Acknowledgements: ARL/CTA grant DAAD19-01-2-0011; NSF grants CCF-0830480 and ECCS-0824007

Distributed sparse estimation
- Data acquired by J agents; agent j holds the local observations y_j.
- Linear model with a sparse common parameter, estimated via the Lasso:
  (P1)  $\min_{\beta} \sum_{j=1}^{J} \| y_j - X_j \beta \|_2^2 + \lambda \| \beta \|_1$
- H. Zou, "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1418-1429, 2006.

Network structure
- Centralized: fusion center. Decentralized: ad hoc topology.
- Ad hoc advantages: scalability, robustness, no need for infrastructure.
- Problem statement: given data y_j and regression matrices X_j available locally at agents j = 1, ..., J, solve (P1) using only local communications among neighbors (in-network processing).

Motivating application
- Scenario: wireless communications; spectrum cartography.
  [Figure: PSD map across space and frequency (MHz)]
- Goal: find a map of the power spectral density (PSD) across space and frequency.
- Specification: a coarse approximation suffices.
- Approach: basis expansion of the PSD.
- J.-A. Bazerque and G. B. Giannakis, "Distributed Spectrum Sensing for Cognitive Radio Networks by Exploiting Sparsity," IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1847-1862, March 2010.

Modeling
- [Figure: sources and sensing radios in space; frequency bases and sensed frequencies]
- Sparsity is present both in space and in frequency.

Space-frequency basis expansion
- The superimposed transmitted (Tx) spectra are measured at each sensing radio R_j.
- Combining the average path loss with the frequency bases yields a model that is linear in the expansion coefficients.

Consensus-based optimization
- Consider local copies beta_j of the common parameter in (P1) and enforce consensus, beta_j = beta_i for neighboring agents i in N_j.
- Introduce auxiliary variables to decompose the consensus constraints: (P2).
- (P1) is equivalent to (P2), and (P2) admits a distributed implementation.

Towards closed-form iterates
- Introduce additional variables gamma_j = beta_j: (P3).
- Idea: reduce the ell_1-penalized subproblem to an orthogonal one, so it is solvable in closed form by soft thresholding.

Alternating-direction method of multipliers
- Form the augmented Lagrangian over the primal variable blocks and the multipliers attached to the constraints.
- AD-MoM step 1: minimize the augmented Lagrangian w.r.t. the first block of primal variables.
- AD-MoM step 2: minimize w.r.t. the second block.
- AD-MoM step 3: minimize w.r.t. the auxiliary variables.
- AD-MoM step 4: update the multipliers via gradient ascent.
- D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, 2nd ed., Athena Scientific, 1999.

D-Lasso algorithm
- Agent j initializes its local estimate and multipliers, and locally runs (see the runnable sketch after the conclusions):
  FOR k = 1, 2, ...
    Exchange the current estimate with the agents in the neighborhood N_j
    Update the local estimate and multipliers
  END FOR
- The only matrix inversion involved is N_j x N_j and can be performed offline, once.

D-Lasso: Convergence
- Proposition: for every value of the constant step size, the local estimates generated by D-Lasso satisfy
  $\lim_{k \to \infty} \beta_j(k) = \hat{\beta}, \quad j = 1, \ldots, J,$
  where $\hat{\beta}$ is the solution of (P1).
- Attractive features:
  - Consensus is achieved across the network.
  - Communication of the sparse estimates with neighbors is affordable.
  - Network-wide data percolate through the exchanges.
  - The numerical operation is fully distributed.

Power spectrum cartography
- Setup: 5 sources, N_s = 121 candidate source locations, J = 50 sensing radios, p = 969.
- [Figures: error evolution vs. iteration; aggregate spectrum map]
- Convergence to the centralized counterpart.
- D-Lasso localizes all sources through variable selection.

Conclusions and future directions
- Sparse linear model with distributed data -> Lasso estimator.
- Ad hoc network topology -> D-Lasso:
  - guaranteed convergence for any constant step size;
  - linear operations per iteration.
- Application: spectrum cartography.
  - Map of interference across space and frequency.
  - Multi-source localization as a byproduct.
- Future directions: online distributed version; asynchronous updates.

Thank You!

D. Angelosante, J.-A. Bazerque, and G. B. Giannakis, "Online Adaptive Estimation of Sparse Signals: Where RLS Meets the ell_1-Norm," IEEE Transactions on Signal Processing, vol. 58, 2010 (to appear).
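To make the AD-MoM recursions outlined above concrete before the backup slides, here is a minimal simulation sketch of consensus-ADMM for the Lasso. It is not the paper's exact D-Lasso recursion: the names (soft_threshold, d_lasso), the synthetic data, and the penalty parameter rho are illustrative, and the network-wide average in the consensus step stands in for the in-network neighbor exchanges that D-Lasso performs.

```python
# Minimal consensus-ADMM sketch for the Lasso across J agents (illustrative).
import numpy as np

def soft_threshold(a, kappa):
    """Entrywise soft thresholding: the closed-form ell_1 proximal step."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)

def d_lasso(X, y, lam, rho=1.0, iters=200):
    """Solve min_beta sum_j ||y_j - X_j beta||^2 + lam * ||beta||_1
    with per-agent local solves and a consensus variable z."""
    J, p = len(X), X[0].shape[1]
    z = np.zeros(p)
    u = [np.zeros(p) for _ in range(J)]
    # Offline: each agent inverts its own system matrix once
    # (cf. the "inversion performed offline" remark on the algorithm slide).
    Ainv = [np.linalg.inv(2 * Xj.T @ Xj + rho * np.eye(p)) for Xj in X]
    for _ in range(iters):
        # Step 1: local ridge-type solves, one per agent.
        beta = [Ainv[j] @ (2 * X[j].T @ y[j] + rho * (z - u[j]))
                for j in range(J)]
        # Step 2: consensus update = soft-thresholded network average
        # (computed via neighbor-only exchanges in the actual D-Lasso).
        avg = np.mean([beta[j] + u[j] for j in range(J)], axis=0)
        z = soft_threshold(avg, lam / (J * rho))
        # Step 3: multiplier (dual ascent) update at each agent.
        for j in range(J):
            u[j] += beta[j] - z
    return z

# Tiny synthetic test: J = 5 agents sharing a sparse common parameter.
rng = np.random.default_rng(0)
p, J, n = 20, 5, 15
beta_true = np.zeros(p)
beta_true[[2, 7, 11]] = [1.5, -2.0, 1.0]
X = [rng.standard_normal((n, p)) for _ in range(J)]
y = [Xj @ beta_true + 0.1 * rng.standard_normal(n) for Xj in X]
print(np.round(d_lasso(X, y, lam=5.0), 2))
```

Since rho is constant, each agent's p x p system matrix never changes, so the inversion is reusable across iterations. The sketch inverts a p x p matrix for brevity; by the matrix inversion lemma the same solve can be reduced to an N_j x N_j inversion (N_j the local data size), which matches the slides when N_j < p.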
Leave-one-agent-out cross-validation
- Agent j is set aside in round-robin fashion; the remaining agents estimate the model, and the cross-validation error is computed on agent j's data.
- Repeat for lambda = lambda_1, ..., lambda_N and select lambda_min, the value minimizing the cross-validation error.
- [Figures: cross-validation error vs. lambda; path of solutions]
- Requires only a sample mean, which can be computed in a distributed fashion (see the sketch after this section).

Test case: prostate cancer antigen
- 67 patients organized into J = 7 groups; the response for patient n in group j measures that patient's level of antigen.
- p = 8 factors: lcavol, lweight, age, lbph, svi, lcp, gleason, pgg45.
- The rows of X_j store the factors measured for the patients in group j.
- [Figure: Lasso vs. D-Lasso coefficient estimates]
- The centralized and distributed solutions coincide.
- The volume of the cancer (lcavol) predominantly affects the level of antigen.

Distributed elastic net
- A quadratic term regularizes the solution; the centralized version appears in [Zou-Zhang '09]:
  Ridge regression:  $\min_{\beta} \|y - X\beta\|_2^2 + \lambda_2 \|\beta\|_2^2$
  Elastic net:       $\min_{\beta} \|y - X\beta\|_2^2 + \lambda_1 \|\beta\|_1 + \lambda_2 \|\beta\|_2^2$
- The elastic net achieves variable selection on ill-conditioned problems.
- H. Zou and H. H. Zhang, "On the Adaptive Elastic-Net with a Diverging Number of Parameters," Annals of Statistics, vol. 37, no. 4, pp. 1733-1751, 2009.
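As a complement to the cross-validation slide, here is a sketch of the leave-one-agent-out procedure over a lambda grid. It reuses d_lasso and the synthetic (X, y) from the earlier sketch; the function name loo_agent_cv and the grid values are illustrative, and the sample mean over held-out agents is computed directly here rather than by in-network averaging.

```python
# Leave-one-agent-out cross-validation sketch (reuses d_lasso, X, y above).
import numpy as np

def loo_agent_cv(X, y, lambdas, **kw):
    """Set each agent aside in round-robin fashion, fit on the rest,
    and score on the held-out agent's data; pick the best lambda."""
    J = len(X)
    cv_err = []
    for lam in lambdas:
        errs = []
        for j in range(J):  # agent j is set aside this round
            X_train = [X[i] for i in range(J) if i != j]
            y_train = [y[i] for i in range(J) if i != j]
            beta = d_lasso(X_train, y_train, lam, **kw)
            errs.append(np.mean((y[j] - X[j] @ beta) ** 2))
        cv_err.append(np.mean(errs))  # sample mean across held-out agents
    return lambdas[int(np.argmin(cv_err))], cv_err

lam_min, err_path = loo_agent_cv(X, y, lambdas=[0.5, 1.0, 2.0, 5.0, 10.0])
print("selected lambda:", lam_min)
```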
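The distributed elastic net fits the same template. Under the consensus splitting used in the d_lasso sketch (my own derivation, not the paper's exact recursion), adding the quadratic term $\lambda_2 \|z\|_2^2$ changes only the consensus update:

\[
z \leftarrow \frac{\mathcal{S}_{\lambda_1}\!\Big(\rho \sum_{j=1}^{J} (\beta_j + u_j)\Big)}{2\lambda_2 + J\rho},
\qquad
\mathcal{S}_{\kappa}(a) := \operatorname{sign}(a)\,\max(|a| - \kappa,\, 0),
\]

so soft thresholding still performs variable selection while the $2\lambda_2$ term in the denominator adds the ridge-type shrinkage that stabilizes ill-conditioned problems.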