
Distributed Nuclear Norm Minimization for Matrix Completion

Morteza Mardani, Gonzalo Mateos and Georgios Giannakis
ECE Department, University of Minnesota
Acknowledgments: MURI (AFOSR FA9550-10-1-0567) grant
Cesme, Turkey, June 19, 2012

Learning from "Big Data"

"Data are widely available, what is scarce is the ability to extract wisdom from them"
Hal Varian, Google's chief economist

Big, ubiquitous, fast, productive, smart, messy, revealing

K. Cukier, "Harnessing the data deluge," Nov. 2011.

Context

• Preference modeling
• Imputation of network data
• Smart metering
• Network cartography

Goal: Given few incomplete rows per agent, impute missing entries in a distributed fashion by leveraging the low rank of the data matrix.

Low-rank matrix completion

• Consider matrix $X \in \mathbb{R}^{L \times T}$ and a set $\Omega$ of observed entries
• Sampling operator $\mathcal{P}_\Omega(\cdot)$ keeps the entries in $\Omega$ and zeroes the rest
• Given incomplete (noisy) data $\mathcal{P}_\Omega(Y)$, where (as) $X$ has low rank
• Goal: denoise observed entries, impute missing ones
• Nuclear-norm minimization [Fazel'02], [Candes-Recht'09]
  Noisy: $\min_X \tfrac{1}{2}\|\mathcal{P}_\Omega(Y - X)\|_F^2 + \lambda \|X\|_*$
  Noise-free: $\min_X \|X\|_*$ s.t. $\mathcal{P}_\Omega(X) = \mathcal{P}_\Omega(Y)$

[Figure: data matrix with missing entries shown as "?"]

Problem statement

• Network: undirected, connected graph
• Goal: Given incomplete rows per node $n$ and single-hop exchanges, find the nuclear-norm estimate (P1)
• Challenges
  • Nuclear norm is not separable
  • Global optimization variable

Separable regularization

• Key result [Recht et al'11]: $\|X\|_* = \min_{\{L,Q\}} \tfrac{1}{2}\left(\|L\|_F^2 + \|Q\|_F^2\right)$ s.t. $X = LQ'$, where $L$ is $L \times \rho$ and $\rho \geq \mathrm{rank}[X]$
• New formulation (P2), equivalent to (P1)
• Nonconvex; reduces complexity

Proposition 1. If $\{\bar{L}, \bar{Q}\}$ is a stationary pt. of (P2) and $\|\mathcal{P}_\Omega(Y - \bar{L}\bar{Q}')\| \leq \lambda$, then $\hat{X} = \bar{L}\bar{Q}'$ is a global optimum of (P1).

Distributed estimator

• (P3): consensus with neighboring nodes
• Network connectivity makes (P2) and (P3) equivalent
• Alternating-directions method of multipliers (ADMM) solver
  • Method [Glowinski-Marrocco'75], [Gabay-Mercier'76]
  • Learning over networks [Schizas et al'07]
• Primal variables per agent $n$
• Message passing

Distributed iterations

Attractive features

• Highly parallelizable with simple recursions
  • Unconstrained QPs per agent
  • No SVD per iteration
• Low overhead for message exchanges
  • $Q_n$ is $T \times \rho$ and $\rho$ is small
  • Comm. cost independent of network size

Recap:
  (P1): centralized, convex
  (P2): sep. regul., nonconvex
  (P3): consensus, nonconvex
  Stationary (P3) → Stationary (P2) → Global (P1)

Optimality

Proposition 2. If the iterates converge and conditions i) and ii) hold, then the resulting estimate is the global optimum of (P1).

• ADMM can converge even for non-convex problems [Boyd et al'11]
• Simple distributed algorithm for optimal matrix imputation
• Centralized performance guarantees, e.g., [Candes-Recht'09], carry over

Synthetic data

• Random network topology
• N=20, L=66, T=66

[Figure: random network topology; both axes range from 0 to 1]

Real data

• Network distance prediction [Liau et al'12]
• Abilene network data (Aug 18-22, 2011)
• End-to-end latency matrix
• N=9, L=T=N
• 80% missing data
• Relative error: 10%

Data: http://internet2.edu/observatory/archive/data-collections.html
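The separable-regularization idea from the slides can be illustrated with a small centralized sketch. It is not the distributed ADMM solver of the talk: it assumes a single machine, and it swaps in plain alternating ridge regressions on the factorized cost $\tfrac{1}{2}\|\mathcal{P}_\Omega(Y - LQ')\|_F^2 + \tfrac{\lambda}{2}(\|L\|_F^2 + \|Q\|_F^2)$. The function names (`balanced_factors`, `objective`, `complete_als`), the test matrix, and the values of `rho` and `lam` are all hypothetical choices made for illustration.

```python
import numpy as np

def balanced_factors(X):
    """Factors L, Q with X = L @ Q.T that attain the nuclear-norm bound."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    root = np.sqrt(s)
    return U * root, Vt.T * root  # scale columns by sqrt(singular values)

def objective(Y, mask, L, Q, lam):
    """Factorized cost: fit on observed entries plus Frobenius penalties."""
    resid = np.where(mask, Y - L @ Q.T, 0.0)
    return 0.5 * np.sum(resid**2) + 0.5 * lam * (np.sum(L**2) + np.sum(Q**2))

def complete_als(Y, mask, rho, lam, iters=200, seed=0):
    """Alternating ridge regressions (assumed stand-in for the ADMM solver).

    Each row of L and each row of Q solves a small rho-by-rho linear
    system, so no SVD is needed inside the loop.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = Y.shape
    L = rng.standard_normal((n_rows, rho))
    Q = rng.standard_normal((n_cols, rho))
    reg = lam * np.eye(rho)
    history = [objective(Y, mask, L, Q, lam)]
    for _ in range(iters):
        for i in range(n_rows):          # update row i of L, Q fixed
            obs = mask[i]
            Qo = Q[obs]
            L[i] = np.linalg.solve(Qo.T @ Qo + reg, Qo.T @ Y[i, obs])
        for j in range(n_cols):          # update row j of Q, L fixed
            obs = mask[:, j]
            Lo = L[obs]
            Q[j] = np.linalg.solve(Lo.T @ Lo + reg, Lo.T @ Y[obs, j])
        history.append(objective(Y, mask, L, Q, lam))
    return L, Q, history

# Check the factorization identity on a random matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
La, Qa = balanced_factors(A)
print("nuclear norm:", np.linalg.norm(A, "nuc"))
print("0.5*(|L|^2+|Q|^2):", 0.5 * (np.sum(La**2) + np.sum(Qa**2)))

# Impute a synthetic rank-3 matrix with 40% of its entries missing.
X_true = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 30))
mask = rng.random(X_true.shape) < 0.6    # True where an entry is observed
Y = np.where(mask, X_true, 0.0)
L_hat, Q_hat, hist = complete_als(Y, mask, rho=5, lam=1e-2)
rel_err = np.linalg.norm(L_hat @ Q_hat.T - X_true) / np.linalg.norm(X_true)
print("relative error:", rel_err)
```

Because each block update is an exact minimization, the recorded cost in `hist` should not increase from one sweep to the next. As with the nonconvex formulation in the slides, the limit is only guaranteed to be a stationary point unless an additional condition certifies global optimality.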
