Parallel and Distributed Programming Models and Languages
15-740/18-740 Computer Architecture
In-Class Discussion
Dong Zhou, Kun Li, Mike Ralph

Why distributed computations?
• Buzzword: Big Data
• Take sorting as an example
  – Amount of data that can be sorted in 60 seconds
  – One computer can read ~60 MB/sec from one disk
  – 2012 world record
    • Flat Datacenter Storage by Ed Nightingale et al.
    • 1470 GB
    • 256 heterogeneous nodes, 1033 disks
• Google indexes 100 billion+ web pages

Solution: use many nodes
• Grid computing
  – Hundreds of supercomputers connected by a high-speed network
• Cluster computing
  – Thousands or tens of thousands of PCs connected by high-speed LANs
• 1000 nodes potentially give a 1000x speedup

Distributed computations are difficult to program
• Sending data to/from nodes
• Coordinating among nodes
• Recovering from node failures
• Optimizing for locality
• Debugging
• …

MapReduce
• A programming model for large-scale computations
  – Process large amounts of input, produce output
  – No side effects or persistent state
• MapReduce is implemented as a runtime library
  – Automatic parallelization
  – Load balancing
  – Locality optimization
  – Handling of machine failures

MapReduce design
• Input data is partitioned into M splits
• Map: extract information from each split
  – Each map produces R partitions
• Shuffle and sort
  – Bring matching partitions to the same reducer
• Reduce: aggregate, summarize, filter, or transform
• Output is written to R result files

More specifically
• Programmer specifies two methods
  – map(k, v) → <k', v'>*
  – reduce(k', <v'>*) → <k'', v''>*
• All v' with the same k' are reduced together
• Usually also specify:
  – partition(k', total partitions) → partition for k'
  – Often a simple hash of the key

Runtime

MapReduce is widely applicable
• Distributed grep
• Distributed clustering
• Web link graph reversal
• Detecting approximate
  duplicate web pages
• …

Dryad
• Similar goals as MapReduce
  – Focus on throughput, not latency
  – Automatic management of scheduling, distribution, fault tolerance
• Computations expressed as a graph
  – Vertices are computations
  – Edges are communication channels
  – Each vertex has several input and output edges

Why use a dataflow graph?
• Many programs can be represented as a distributed dataflow graph
  – The programmer may not have to know this
• ``SQL-like'' queries: LINQ
• Dryad will run them for you

Runtime
• Vertices (V) run arbitrary application code
• Vertices exchange data through files, TCP pipes, etc.
• Vertices communicate with the JM to report status
• Job Manager (JM) consults the name server (NS) to discover available machines
• JM maintains the job graph and schedules vertices
• Daemon process (D) executes vertices

Job = Directed Acyclic Graph
(Diagram: inputs flow through processing vertices to outputs; channels are files, pipes, or shared memory)

Advantages of DAG over MapReduce
• Big jobs are more efficient with Dryad
  – MapReduce: a big job runs as > 1 MR stages
    • Reducers of each stage write to replicated storage
    • Output of reduce: 2 network copies, 3 disks
  – Dryad: each job is represented as a single DAG
    • Intermediate vertices write to local files
    • …

Pig Latin
• High-level procedural abstraction of MapReduce
• Contains SQL-like primitives
• Example:
  good_urls = FILTER urls BY pagerank > 0.2;
  groups = GROUP good_urls BY category;
  big_groups = FILTER groups BY COUNT(good_urls) > 10^6;
  output = FOREACH big_groups GENERATE category, AVG(good_urls.pagerank);
• Plus user-defined functions (UDFs)

Value
• Reduces development time
• Procedural vs. declarative
• Overhead/performance costs worthwhile?
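The Pig Latin example above can be sketched in plain Python to show what each statement computes. The urls table and field layout are hypothetical sample data, and the group-size threshold is shrunk from the slide's 10^6 to fit the toy data; real Pig compiles these statements into MapReduce jobs.

```python
from collections import defaultdict

# Hypothetical sample data: (url, category, pagerank) records.
urls = [
    ("a.com", "news", 0.75),
    ("b.com", "news", 0.5),
    ("c.com", "blogs", 0.125),
    ("d.com", "blogs", 0.25),
]
GROUP_SIZE_THRESHOLD = 1  # the slide uses 10^6; shrunk for the toy data

# good_urls = FILTER urls BY pagerank > 0.2;
good_urls = [u for u in urls if u[2] > 0.2]

# groups = GROUP good_urls BY category;
groups = defaultdict(list)
for u in good_urls:
    groups[u[1]].append(u)

# big_groups = FILTER groups BY COUNT(good_urls) > threshold;
big_groups = {c: us for c, us in groups.items()
              if len(us) > GROUP_SIZE_THRESHOLD}

# output = FOREACH big_groups GENERATE category, AVG(good_urls.pagerank);
output = {c: sum(u[2] for u in us) / len(us) for c, us in big_groups.items()}
print(output)  # {'news': 0.625}
```

Each Pig Latin statement maps onto one comprehension or loop, which is the sense in which the language is procedural: the programmer fixes the order of operations, unlike a declarative SQL query.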
  (Figure: spectrum of abstraction — Assembly, C/C++, MapReduce, Pig Latin)

Green-Marl
• High-level graph analysis language/compiler
• Uses basic data types and graph primitives
• Built-in graph traversals
  – BFS, RBFS, DFS
• Uses domain-specific optimizations
  – Both non-architecture-specific and architecture-specific
• Compiler translates Green-Marl to another high-level language (e.g., C++)

Tradeoffs
• Achieves speedup over hand-tuned parallel equivalents
• Tested only on a single workstation
• Only works with graph representations
  – Difficulty representing certain data sets and computations
• Domain-specific vs. general-purpose languages
• Future work for more architectures, user-defined data structures

Questions and Discussion

Example: count word frequencies in web pages
• Input is files with one doc per record
• Map parses a document into words
  – key = document URL
  – value = document contents
• Output of map:
  "doc1", "to be or not to be" →
    "to", "1"
    "be", "1"
    "or", "1"
    "not", "1"
    "to", "1"
    "be", "1"

Example: count word frequencies in web pages
• Reduce: computes the sum for each key
  key = "be",  values = "1", "1" → "2"
  key = "not", values = "1"      → "1"
  key = "or",  values = "1"      → "1"
  key = "to",  values = "1", "1" → "2"
• Output of reduce is saved:
  "to", "2"
  "be", "2"
  "or", "1"
  "not", "1"

Example: pseudo-code

Map(String input_key, String input_value):
  // input_key: document name
  // input_value: document contents
  for each word w in input_value:
    EmitIntermediate(w, "1");

Reduce(String key, Iterator intermediate_values):
  // key: a word, same for input and output
  // intermediate_values: a list of counts
  int result = 0;
  for each v in intermediate_values:
    result += ParseInt(v);
  Emit(AsString(result));
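The pseudo-code above can be exercised as a minimal, single-process sketch in Python. The names map_fn, reduce_fn, and run_mapreduce are illustrative, not from any real MapReduce library; the shuffle phase is simulated with an in-memory dict, and there is no distribution or fault tolerance.

```python
from collections import defaultdict

def map_fn(input_key, input_value):
    # input_key: document name; input_value: document contents
    for word in input_value.split():
        yield word, "1"

def reduce_fn(key, intermediate_values):
    # key: a word; intermediate_values: a list of string counts
    return str(sum(int(v) for v in intermediate_values))

def run_mapreduce(documents):
    # Map phase: emit intermediate (k', v') pairs from every document.
    intermediate = defaultdict(list)
    for doc_name, contents in documents.items():
        for k, v in map_fn(doc_name, contents):
            intermediate[k].append(v)
    # "Shuffle": the dict already groups all v' with the same k'.
    # Reduce phase: each key's values are reduced together.
    return {k: reduce_fn(k, vs) for k, vs in intermediate.items()}

counts = run_mapreduce({"doc1": "to be or not to be"})
print(counts)  # {'to': '2', 'be': '2', 'or': '1', 'not': '1'}
```

Running it on the slide's example document reproduces the word counts shown above; the real runtime differs only in that the map and reduce calls run on many machines and the intermediate pairs travel over the network.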