A MapReduce-Based Maximum-Flow Algorithm for Large Small-World Network Graphs
Felix Halim, Roland H.C. Yap, Yongzheng Wu

Outline
• Background and Motivation
• Overview
– Maximum-flow algorithm (the Ford-Fulkerson method)
– MapReduce (MR) framework
• Parallelizing the Ford-Fulkerson Method
– Incremental updates, bi-directional search, multiple excess paths
• MapReduce Optimizations
– Stateful extension, trading off space vs. number of rounds
• Experimental Results
• Conclusion

Background and Motivation
• Large small-world network graphs arise naturally in
– the World Wide Web, social networks, biology, etc.
– They have been shown to have small diameter and to be robust
• Maximum-flow computations on large graphs can
– isolate groups of spam sites on the WWW
– identify communities
– defend against Sybil attacks
• Challenge: the computation, storage, and memory requirements exceed the capacity of a single machine
• Our approach: cloud/cluster computing
– Run max-flow on top of the MapReduce framework

Maximum-Flow Overview
• Given a flow network G = (V, E) with source s and sink t
• Residual network: Gf = (V, Ef)
• Augmenting path: a simple path from s to t in the residual network
• The Ford-Fulkerson method computes the maximum flow in O(|f*| |E|) time:
– while an augmenting path p is found in Gf, augment the flow along p

Maximum-Flow Example
• [Figure: four slides step through the Ford-Fulkerson method on a small flow network; each slide shows the residual network (residual = C – F), highlights an augmenting path p from s to t, and augments the flow along it; the final slide shows the flow/capacity labels with Max-Flow = 14]
• Legend: s = source node, t = sink node, other circles = intermediate nodes

MapReduce Framework Overview
• Introduced by Google in 2004
• Open-source implementation: Hadoop
• Operates on very large datasets
– consisting of key/value pairs
– spread across thousands of commodity machines
– on the order of terabytes of data
• Abstracts away distributed-computing concerns
– data partitioning and distribution, load balancing
– scheduling, fault tolerance, communication, etc.

MapReduce Model
• User-defined map and reduce functions (stateless)
• Input: a list of key/value tuples (k1/v1)
– The user's map function is applied to each key/value pair
– It produces a list of intermediate key/value pairs
• Output: a list of key/value tuples (k2/v2)
– The intermediate values are grouped by key
– The user's reduce function is applied to each group
• Each tuple is independent
– It can be processed in isolation, in a massively parallel manner
– The total input can be far larger than the workers' total memory

Max-Flow on MR – Observations
• Input: a small-world graph with small diameter D
– Each vertex of the graph is modeled as one input tuple in MR
• A naive translation of the Ford-Fulkerson method to MapReduce requires O(|f*| D) MR rounds
– Breadth-first search on MR requires O(D) rounds (see the sketch below)
– BFSMR on 20 machines takes 9 rounds and 6 hours for a SWN with ~400M vertices and ~31B edges
– Computing a max-flow with |f*| > 1000 would take years
• Question: how can the number of rounds be minimized?
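To make the O(D)-rounds claim concrete, here is a minimal Python sketch (our own illustration, not the paper's implementation) of breadth-first search expressed as repeated MapReduce-style rounds: in each round the map phase expands the previous frontier by one hop and the reduce phase keeps the smallest distance seen per vertex, so covering the graph takes on the order of D rounds. The graph representation and names (`adj`, `bfs_rounds`) are assumptions made for the sketch.

```python
from collections import defaultdict

INF = float("inf")

def bfs_rounds(adj, source):
    """BFS as repeated MapReduce-style rounds (illustrative sketch only).

    adj: dict mapping each vertex to a list of neighbors.
    Returns (distances, rounds); the round count grows with the graph
    diameter D, which is why a naive Ford-Fulkerson translation costs
    O(|f*| * D) MR rounds in total.
    """
    dist = {source: 0}
    rounds = 0
    changed = True
    while changed:                       # one MR job per iteration
        rounds += 1
        # Map: every vertex discovered in the previous round emits
        # (neighbor, distance + 1) for each outgoing edge.
        emitted = defaultdict(list)
        for v, d in dist.items():
            if d == rounds - 1:          # the current frontier
                for w in adj.get(v, []):
                    emitted[w].append(d + 1)
        # Reduce: keep the minimum distance seen for each vertex.
        changed = False
        for w, ds in emitted.items():
            best = min(ds)
            if best < dist.get(w, INF):
                dist[w] = best
                changed = True
    return dist, rounds

# Tiny example: s -> a -> b -> t, plus a shortcut s -> b.
adj = {"s": ["a", "b"], "a": ["b"], "b": ["t"], "t": []}
print(bfs_rounds(adj, "s"))              # distances plus the round count
```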
FF1: A Parallel Ford-Fulkerson
• Goal: minimize the number of rounds through parallelism
• Do more work per round; avoid spilling tasks into the next round
– Use speculative execution to increase parallelism
– Incremental updates
– Bi-directional search
• doubles the parallelism (the number of active vertices)
• effectively halves the expected number of rounds
• Maintain parallelism (keep the number of active vertices large)
– Multiple excess paths (k) → the most effective technique
• Each vertex stores k excess paths, so it avoids becoming inactive
• Result:
– The number of rounds required drops from O(|f*| D) to ~D
– A large number of augmenting paths per round (the MR bottleneck); see the round sketch below

MR Optimizations
• Goal: minimize MR bottlenecks and overheads
• FF2: external worker(s) as a stateful MR extension
– In FF1, the reducer for the sink vertex t is the bottleneck
– An external process handles augmenting-path acceptance instead
• FF3: the Schimmy method [Lin10]
– Avoids shuffling the master graph
• FF4: eliminate object instantiations
• FF5: minimize MR shuffle (communication) costs
– Monitor the extended excess paths for saturation and resend as needed
– Recompute/reprocess the graph instead of shuffling the delta (not in the paper)

Experimental Results
• Facebook sub-graphs:

Graph  #Vertices  #Edges     Size (HDFS)  MaxSize
FB1    21 M       112 M      587 MB       8 GB
FB2    73 M       1,047 M    6 GB         54 GB
FB3    97 M       2,059 M    13 GB        111 GB
FB4    151 M      4,390 M    30 GB        96 GB
FB5    225 M      10,121 M   69 GB        424 GB
FB6    441 M      31,239 M   238 GB       1,281 GB

• Cluster setup
– Hadoop v0.21-RC-0 installed on 21 nodes
– Each node: 8 hyper-threaded cores (2× Intel E5520 @ 2.27 GHz), 3× 500 GB SATA hard disks, CentOS 5.4 (64-bit)

Handling Large Max-Flow Values
• FF5MR is able to process FB6 in a very small number of MR rounds (close to the graph diameter)

MapReduce Optimizations
• FF1 (parallel FF) through FF5 (MR-optimized) vs. BFSMR
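To illustrate the multiple-excess-paths idea behind FF1, below is a deliberately simplified Python sketch of a single MR-style round. It is an assumption-laden model, not the authors' code: it ignores residual capacities carried along the path, incremental updates, and the bi-directional search. In the map phase every vertex extends each excess path it holds by one hop along residual edges; in the reduce phase every vertex keeps at most k excess paths, which keeps it active for the next round, and paths that reach the sink t become candidate augmenting paths.

```python
from collections import defaultdict

def ff1_round(graph, excess, k=3):
    """One simplified FF1-style round; all names here are illustrative.

    graph:  dict vertex -> list of (neighbor, residual_capacity)
    excess: dict vertex -> list of excess paths from s (vertex lists)
    Returns (excess paths for the next round, paths that reached t).
    """
    # Map: extend every stored excess path by one residual edge.
    emitted = defaultdict(list)
    for v, paths in excess.items():
        for path in paths:
            for w, cap in graph.get(v, []):
                if cap > 0 and w not in path:    # keep paths simple
                    emitted[w].append(path + [w])
    # Reduce: each vertex keeps at most k excess paths so it stays
    # active; paths arriving at the sink become augmenting candidates.
    new_excess, candidates = {}, []
    for w, paths in emitted.items():
        if w == "t":
            candidates.extend(paths)
        else:
            new_excess[w] = paths[:k]
    return new_excess, candidates

graph = {"s": [("a", 5), ("b", 3)], "a": [("t", 4)], "b": [("t", 2)]}
excess = {"s": [["s"]]}
excess, found = ff1_round(graph, excess)   # round 1: frontier reaches a, b
excess, found = ff1_round(graph, excess)   # round 2: two candidates hit t
print(found)                               # [['s', 'a', 't'], ['s', 'b', 't']]
```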
Shuffle Bytes Reduction
• The bottleneck in MR is the shuffled bytes; FF5 minimizes the number of bytes shuffled

Scalability (#Machines and Graph Size)

Conclusion
• We showed how to parallelize a sequential max-flow algorithm, minimizing the number of MR rounds
– incremental updates, bi-directional search, and multiple excess paths
• We showed MR optimizations for max-flow
– stateful extension, minimized communication costs
• Computing max-flow on large small-world graphs is practical using FF5MR

Q&A
• Thank you

Backup Slides
• Related work
• Multiple excess paths – results
• Edges processed per second – results
• MapReduce example (word count)
• MapReduce execution flow
• FF1 map function
• FF1 reduce function

MR Related Work
• The push-relabel algorithm
– is distributed (no global view of the graph is required)
– has been developed for SMP architectures
– needs sophisticated heuristics to push the flow
• It is not suitable for the MapReduce model
– It needs locks and pulls information from its neighbors
– It keeps only a low number of active vertices
– Pushing flow into the wrong sub-graph can lead to a huge number of rounds

Multiple Excess Paths – Effectiveness
• The larger k is, the fewer MR rounds are required (the number of active vertices stays high)

# Edges processed / sec
• The larger the graph, the more effective the computation

MapReduce Example – Word Count

```python
from collections import defaultdict

def map_fn(key, value):            # key: document name, value: contents
    for word in value.split():
        yield word, 1              # emit an intermediate (word, 1) pair

def reduce_fn(key, values):        # key: a word, values: list of counts
    yield key, sum(values)         # emit (word, total frequency)

# Minimal local driver: group the intermediate pairs by key, then reduce.
groups = defaultdict(list)
for word, one in map_fn("doc1", "to be or not to be"):
    groups[word].append(one)
for word, counts in groups.items():
    print(next(reduce_fn(word, counts)))
```

MapReduce Execution Flow

MapReduce Simple Applications
• Word count
• Distributed grep
• Counting URL access frequency
• Reverse web-link graph
• Term vector per host
• Inverted index
• Distributed sort
• All of these tasks can be completed in one MR job

General MR Optimizations
• External workers as a stateful extension for MR (see the sketch below)
– (Dedicated) external workers run outside the mappers/reducers
– They process requests immediately, without waiting for the mappers/reducers to complete
– Flexible synchronization point
• Minimize the shuffling of intermediate tuples
– Shuffling can be avoided by re-processing/re-computing in the reduce phase
– Flags in the data structure prevent re-shuffling
• Eliminate object instantiation
– Use binary serialization
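As a rough sketch of the "external worker as stateful extension" idea (a hand-written illustration under assumed names, not the paper's FF2 implementation), the class below plays the role of the long-lived external process: reducers hand it candidate augmenting paths as they are found, and it accepts or rejects each one immediately against the residual capacities it owns, instead of funneling all acceptance decisions through the sink vertex's reducer. In a real deployment this state would live in a separate server process that the reducers contact over the network; here it is a plain object so the sketch stays self-contained.

```python
class ExternalAcceptor:
    """Stand-in for a stateful external worker (illustrative only)."""

    def __init__(self, residual):
        self.residual = residual          # dict: (u, v) -> remaining capacity

    def try_accept(self, path):
        """Accept a candidate augmenting path if it is still unsaturated.

        Returns the amount of flow granted; 0 means the path was
        rejected because a conflicting path was accepted first.
        """
        edges = list(zip(path, path[1:]))
        grant = min(self.residual.get(e, 0) for e in edges)
        if grant <= 0:
            return 0
        for e in edges:
            self.residual[e] -= grant     # commit immediately, statefully
        return grant

acceptor = ExternalAcceptor({("s", "a"): 5, ("a", "t"): 4,
                             ("s", "b"): 3, ("b", "t"): 2})
print(acceptor.try_accept(["s", "a", "t"]))   # -> 4
print(acceptor.try_accept(["s", "a", "t"]))   # -> 0, edge (a, t) saturated
print(acceptor.try_accept(["s", "b", "t"]))   # -> 2
```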