GraphSC: Parallel Secure Computation Made Easy
Kartik Nayak
With Xiao Shaun Wang, Stratis Ioannidis, Udi Weinsberg, Nina Taft, Elaine Shi

Data Mining on User Data
[Diagram labels: Users, Data, Data Mining Engine, Data Model]
Privacy concern!

Companies Computing on Private Data
Graph representing social connections
Graph representing professional connections
Compute a user’s influence in both circles

Companies want to run machine learning algorithms
Users/Companies do NOT want to reveal data
Can we enable this in practice?

Cryptography to the rescue: Secure Multiparty Computation
Ensures that we learn only the outcome

Key Challenges
1. Generic Solutions
A lot of work has gone into improving individual algorithms
Departure from the one-at-a-time approach

Key Challenges
2. Convert Program to Run on Secure Computation (cost of obliviousness)

Key Challenges
3. Parallelizability
There is a lot of data: maintain the benefits of parallelism available in the insecure setting
With cryptography, computation is expensive

Key Contributions
Challenge: Generic Solutions
Generic Framework for “Graph-parallel” Algorithms
PageRank
Risk Minimization using ADMM
Matrix Factorization using ALS
Matrix Factorization using gradient descent
And many more

Key Contributions
Challenge: Convert program to run on Secure Computation
Efficiently Convert Graph-parallel Programs to Oblivious Programs
Total work blowup is O(log |V|)
Blowup for naïve solution: O(|V|) for sparse graphs

Key Contributions
Challenge: Parallelizability
Maintain Parallelizability
Depth of the computation is O(log |V|)
Matrix Factorization (4K ratings, 32 threads): 1.4 hours [NIWJTB’13] vs. < 4 mins

Key Contributions
1. Generic Framework for Graph-parallel Algorithms
2. Efficiently Convert to Oblivious Programs
3. Maintain Parallelizability

Programmer’s favorite model

    function bs(val, s, t)
      mid = (s + t) / 2;
      if (val < mem[mid])
        bs(val, s, mid)
      else
        bs(val, mid+1, t)

Cryptographer’s favorite model
[Figure not preserved]

Programmer’s model: Programs
Oblivious Programs
Cryptographer’s model: Circuits
Intuitively, program traces should not depend on input data

Programmer’s model: Programs
  (hard)
Oblivious Programs
  (easy)
Cryptographer’s model: Circuits
Intuitively, program traces should not depend on input data

Achieving Parallelism
Oblivious Parallel RAM [BCP’14]
Polylogarithmic blowup: not practical
GraphSC: O(log |V|) blowup
Goal: Low Depth Circuits

“Graph-parallel” algorithms [LGKB’10, GLGBG’12, MABDHLC’10, ZCF’10]
Pregel

Graph-parallel Algorithms
[Figure: example graph with vertices A, B, C, D and numeric data on vertices and edges]
Scatter: Send data to edges
Gather: Aggregate data from edges
Apply: Perform some computation

Obliviousness of Graph-parallel Algorithms
[Figure: the same example graph with vertices A, B, C, D]
Do not reveal edge/vertex data
Do not reveal structure of the graph
Naïve Solution: O(|V|^2)
Our Solution: O(|E| log |V|)

Oblivious Gather – Key Trick
Oblivious Sort with (v, isVertex)
Single pass
Sort: O(|E| log |V|)
Single pass: O(|E|)
Oblivious Gather: O(|E| log |V|)
Gather in clear: O(|E|)

Complexity of Our Algorithms

                  Sequential Insecure   Naïve Oblivious   Parallel Oblivious   Parallel Oblivious
                  (Total Work)          (Total Work)      (Total Work)         (Parallel Time)
  Scatter/Gather  O(|E|)                O(|V|^2)          O(|E| log |V|)       O(log |V|)
  Apply           O(|V|)                O(|E|)            O(|E|)               O(1)

Algorithms on GraphSC
Histogram computation
PageRank
Matrix Factorization using gradient descent
Matrix Factorization using alternating least squares
Bellman-Ford shortest path
Bipartite matching
Parallel empirical risk minimization through alternating direction method of multipliers (ADMM)

Experimental Setup
Cloud 1 (Garblers) …
Cloud 2 (Evaluators) …
Two Scenarios:
1. LAN
2. Across Data Centers (WAN)

Key Evaluation Results

  Algorithm                                      Input Size   Parallel Time (32 processors)
  Histogram                                      1K – 0.5M    4 sec – 34 min
  PageRank (1 iteration)                         4K – 128K    20 sec – 15.5 min
  Matrix Factorization (1 iteration), using GD   1K – 32K     47 sec – 34 min
  Matrix Factorization (1 iteration), using ALS  64 – 4K      2 min – 2.35 hours

Running at Scale
Matrix Factorization using gradient descent: 1M ratings, 6K users, 4K movies [KBV’09]
Time taken: ~13 hours (1 iteration)
7 machine cluster, 128 processors, 525 GB RAM
We used only 7 machines! 13 hours -> few mins by using more machines
[NIWJTB’13]: Max: 16K ratings (64x smaller data); 4K ratings, 32 threads: 1.4 hours vs. < 4 mins

Across Data Centers
PageRank
Garblers: Oregon
Evaluators: N. Virginia
Bandwidth provisioned: 2 Gbps
Time reduces linearly with increasing processors

Conclusion
GraphSC is a parallel secure computation framework for graph-parallel algorithms
www.oblivm.com
Thank You!
kartik@cs.umd.edu
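
Note on "program traces should not depend on input data": the binary search on the "Programmer's favorite model" slide reads mem[mid], and mid depends on val, so the sequence of addresses it touches leaks information about the input. The sketch below contrasts that with the simplest oblivious alternative, a full linear scan. It is illustration only, runs in the clear with no cryptography, and is not how GraphSC handles memory; the function names `binary_search_trace` and `oblivious_scan_trace` are hypothetical.

```python
from typing import List, Tuple


def binary_search_trace(mem: List[int], val: int) -> Tuple[int, List[int]]:
    """Lower-bound binary search over sorted mem; records the indices it reads."""
    trace: List[int] = []
    s, t = 0, len(mem)
    while s < t:
        mid = (s + t) // 2
        trace.append(mid)          # which address is read depends on val
        if mem[mid] >= val:
            t = mid
        else:
            s = mid + 1
    return s, trace


def oblivious_scan_trace(mem: List[int], val: int) -> Tuple[int, List[int]]:
    """Same answer, but every cell is read in a fixed order for any val."""
    trace: List[int] = []
    pos = 0
    for i in range(len(mem)):
        trace.append(i)            # access pattern is 0, 1, ..., n-1 regardless of val
        pos += int(mem[i] < val)   # arithmetic update, no data-dependent address
    return pos, trace
```

The scan costs O(n) per lookup instead of O(log n), which is one concrete instance of the "cost of obliviousness" named in the challenges.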
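
Note on the Scatter / Gather / Apply model from the "Graph-parallel Algorithms" slide: a minimal sketch of one PageRank iteration written as those three phases, in the clear and without any security. The damped PageRank update, the damping value 0.85, the assumption that every vertex has at least one outgoing edge, and the name `pagerank_iteration` are illustrative assumptions, not details taken from the slides.

```python
from typing import Dict, Hashable, List, Tuple


def pagerank_iteration(
    rank: Dict[Hashable, float],
    edges: List[Tuple[Hashable, Hashable]],
    damping: float = 0.85,  # assumed value, not from the slides
) -> Dict[Hashable, float]:
    """One PageRank step; assumes every vertex has at least one outgoing edge."""
    out_degree = {v: 0 for v in rank}
    for u, _ in edges:
        out_degree[u] += 1

    # Scatter: each vertex sends rank / out_degree onto its outgoing edges.
    edge_data = [(v, rank[u] / out_degree[u]) for u, v in edges]

    # Gather: each vertex aggregates (sums) the data on its incoming edges.
    gathered = {v: 0.0 for v in rank}
    for v, contribution in edge_data:
        gathered[v] += contribution

    # Apply: each vertex computes its new value from the aggregate.
    n = len(rank)
    return {v: (1.0 - damping) / n + damping * gathered[v] for v in rank}
```

Scatter and Gather touch every edge, and Apply touches every vertex, which matches the O(|E|) and O(|V|) entries in the "Sequential Insecure" column of the complexity table.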
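
Note on the "Oblivious Gather – Key Trick" slide: the trick is to sort edges and vertices together by (v, isVertex) and then make a single pass. The sketch below is one way to read that, under stated assumptions: each edge is keyed by its destination vertex, the aggregate is a sum, a bitonic sorting network stands in for the oblivious sort, and vertex ids are below 2**62. Everything runs in the clear; in a secure computation each compare-exchange and each select would be evaluated inside the protocol. All names are hypothetical.

```python
from typing import List, Tuple

# (v, is_vertex, data). Edge: v is its destination, data is the scattered value.
# Vertex: v is its id, data is overwritten with the gathered sum.
Item = Tuple[int, int, int]
PAD: Item = (2**62, 1, 0)  # dummy item that sorts after all real items


def compare_exchange(a: List[Item], i: int, j: int, ascending: bool) -> None:
    """Order a[i], a[j] by key (v, is_vertex); both slots are always rewritten."""
    x, y = a[i], a[j]
    swap = (x[:2] > y[:2]) == ascending
    a[i], a[j] = (y, x) if swap else (x, y)


def bitonic_sort(a: List[Item]) -> None:
    """Bitonic network: the compared positions depend only on len(a), a power of two."""
    n = len(a)
    k = 2
    while k <= n:
        j = k // 2
        while j >= 1:
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    compare_exchange(a, i, partner, ascending=(i & k) == 0)
            j //= 2
        k *= 2


def oblivious_gather(items: List[Item]) -> List[Item]:
    n = 1
    while n < len(items):
        n *= 2
    a = list(items) + [PAD] * (n - len(items))
    bitonic_sort(a)  # incoming edges of v now sit directly before vertex v
    out: List[Item] = []
    running = 0
    for v, is_vertex, data in a:  # single pass with arithmetic selects
        out.append((v, is_vertex, is_vertex * running + (1 - is_vertex) * data))
        running = (1 - is_vertex) * (running + data)  # reset after each vertex
    return out[: len(items)]  # padding sorted to the end, so drop it
```

Because isVertex is 0 for edges and 1 for vertices, the sort places every incoming edge of v immediately before v, so the running sum at each vertex is exactly its gathered value. The network performs O(n log^2 n) comparisons, so this sketch does not reach the O(|E| log |V|) bound on the slide; that bound needs an asymptotically optimal oblivious sort.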