Slides - Applied parallel Computing

A Model of Computation for
MapReduce
Karloff, Suri and Vassilvitskii
(SODA’10)
Presented by Ning Xie
Why MapReduce



Tera- and petabyte-scale data sets (search
engines, internet traffic,
bioinformatics, etc.)
Need parallel computing
Requirements: easy to program,
reliable, distributed
What is MapReduce



A new framework for parallel
computing originally developed at
Google (before ’04)
Widely adopted and became
standard for large scale data analysis
Hadoop (open source version) is
being used by Yahoo, Facebook,
Adobe, IBM, Amazon, and many
institutions in academia
What is MapReduce (cont.)

Three-stage operations:
• Map-stage: mapper operates on a single
pair <key, value>, outputs any number
of new pairs <key’, value’>
• Shuffle-stage: all values that are
associated with an individual key are sent
to a single machine (done by the system)
• Reduce-stage: reducer operates on
all the values for one key and outputs a multiset of
<key, value> pairs
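The three stages above can be sketched as a single round in Python. This is a minimal single-machine illustration, not how Hadoop or Google's system is implemented; `run_round`, `wc_map` and `wc_reduce` are hypothetical names, and word count is used only as a familiar toy workload.

```python
from collections import defaultdict

def run_round(pairs, mapper, reducer):
    """Run one map-shuffle-reduce round over a list of (key, value) pairs."""
    # Map stage: every pair is processed independently of the others.
    mapped = [out for k, v in pairs for out in mapper(k, v)]
    # Shuffle stage: gather all values that share a key.
    groups = defaultdict(list)
    for k, v in mapped:
        groups[k].append(v)
    # Reduce stage: one reducer call per key, seeing all values for that key.
    return [out for k, vs in groups.items() for out in reducer(k, vs)]

# Toy workload (an assumption for illustration): count words.
def wc_map(_, line):
    return [(word, 1) for word in line.split()]

def wc_reduce(word, counts):
    return [(word, sum(counts))]

result = dict(run_round([(0, "a b a"), (1, "b a")], wc_map, wc_reduce))
```

In a real deployment the grouping step is performed by the system across machines; here it is an in-memory dictionary.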
What is MapReduce (cont.)



Map operation is stateless (parallel)
Shuffle stage is done automatically
by the underlying system
Reduce stage can only start when all
Map operations are done
(interleaving between sequential and
parallel)
An example: kth frequency moment
of a large data set
Input: $x \in \Sigma^n$, where $\Sigma$ is a finite set of
symbols
Let $f(\sigma)$ be the frequency of symbol $\sigma$
note: $\sum_{\sigma} f(\sigma) = n$
Want to compute $\sum_{\sigma} f^k(\sigma)$

An example (cont.)

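Input to each mapper: $\langle i, x_i \rangle$
• $M_1(\langle i, x_i \rangle) = \langle x_i, i \rangle$

(i is the index)
Input to each reducer: $\langle x_i, \{i_1, i_2, \ldots, i_m\} \rangle$
• $R_1(\langle x_i, \{i_1, i_2, \ldots, i_m\} \rangle) = \langle x_i, m^k \rangle$


Each mapper: $M_2(\langle x_i, v \rangle) = \langle \$, v \rangle$
A single reducer:
$R_2(\langle \$, \{v_1, \ldots, v_l\} \rangle) = \langle \$, \sum_i v_i \rangle$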
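The two rounds of this example can be traced in a few lines of Python. This is a sequential sketch that assumes a non-empty input; the helper names `shuffle` and `frequency_moment` are hypothetical, and the shuffle is simulated with a dictionary.

```python
from collections import defaultdict

def shuffle(pairs):
    """Group values by key, as the shuffle stage would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def frequency_moment(x, k):
    """k-th frequency moment of the sequence x, in two simulated rounds."""
    # Round 1, map: <i, x_i> becomes <x_i, i>.
    mapped1 = [(symbol, i) for i, symbol in enumerate(x)]
    # Round 1, reduce: a symbol with m indices becomes <symbol, m^k>.
    u1 = [(symbol, len(idx) ** k) for symbol, idx in shuffle(mapped1).items()]
    # Round 2, map: send every value to the single key "$".
    mapped2 = [("$", v) for _, v in u1]
    # Round 2, reduce: the single reducer sums all values.
    u2 = [(key, sum(vs)) for key, vs in shuffle(mapped2).items()]
    return u2[0][1]
```

For the string "abracadabra" the frequencies are 5, 2, 2, 1, 1, so the second moment is 25 + 4 + 4 + 1 + 1 = 35.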
Formal Definitions


A MapReduce program consists of a
sequence $\langle M_1, R_1, M_2, R_2, \ldots, M_l, R_l \rangle$ of
mappers and reducers
The input is $U_0$, a multiset of
<key, value> pairs
Formal Definitions (cont.)

Execution of the program
For $r = 1, 2, \ldots, l$:
1. Feed each $\langle k, v \rangle$ in $U_{r-1}$ to mapper $M_r$.
Let the output be $U'_r$.
2. For each key $k$, construct the multiset
$V_{k,r} = \{ v_i : \langle k, v_i \rangle \in U'_r \}$.
3. For each $k$, feed $k$ and some permutation of
$V_{k,r}$ to a separate instance of $R_r$. Let $U_r$ be
the multiset of <key, value> pairs generated
by $R_r$.
The MapReduce Class (MRC)

On input {<key, value>} of size n
• Memory: each mapper/reducer uses $O(n^{1-\epsilon})$
space
• Machines: there are $\Theta(n^{1-\epsilon})$ machines available
• Time: each machine runs in time polynomial in
n, not polynomial in the length of the input
it receives
• Randomized algorithms for map and reduce
• Rounds: shuffle is expensive


$\mathrm{MRC}^i$: number of rounds = $O(\log^i n)$
DMRC: the deterministic variant
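A quick consequence of the memory and machine bounds, worked out as a sketch with constants ignored: the total memory across the cluster is the product of the two bounds.

$$
\underbrace{\Theta(n^{1-\epsilon})}_{\text{machines}} \cdot \underbrace{O(n^{1-\epsilon})}_{\text{space per machine}} = O(n^{2-2\epsilon})
$$

As an illustrative choice of numbers (not from the paper), take $n = 10^{12}$ and $\epsilon = 1/2$. Then $n^{1-\epsilon} = 10^{6}$, so about $10^{6}$ machines each hold about $10^{6}$ items, and the total memory is about $10^{12}$, which is just enough to hold the input. For smaller $\epsilon$ the total $n^{2-2\epsilon}$ exceeds $n$, but no single machine ever sees the whole input.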
Comparing MRC with PRAM

Most relevant classical computation
model is the PRAM (Parallel Random
Access Machine) model
The corresponding class is NC
Easy relation: $\mathrm{MRC} \subseteq \mathrm{P}$
Lemma: If $\mathrm{NC} \neq \mathrm{P}$, then $\mathrm{MRC} \not\subseteq \mathrm{NC}$

Open question: show that $\mathrm{DMRC} \neq \mathrm{P}$



Comparing with PRAM (cont.)

Simulation lemma:
Any CREW (concurrent read, exclusive write) PRAM
algorithm that uses $O(n^{2-2\epsilon})$ total memory and
$O(n^{2-2\epsilon})$ processors and runs in time $t(n)$ can be
simulated by an algorithm in DMRC that runs in
$O(t(n))$ rounds
Example: Finding an MST


Problem: find the minimum spanning tree
of a dense graph
The algorithm
• Randomly partition the vertices into k parts
• For each pair of vertex sets, find the MST of
the bipartite subgraph induced by these two
sets
• Take the union of the edges in the MSTs of
all pairs; call the resulting graph H
• Compute an MST of H
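The steps above can be sketched sequentially in Python. Several things here are assumptions rather than part of the slides: Kruskal's algorithm is used for every MST computation, each pair's subgraph is taken to be the subgraph induced on the union of the two vertex sets (so edges inside a part are kept), at least two parts are used, and the names `kruskal` and `partitioned_mst` are hypothetical. Edges are `(weight, u, v)` tuples.

```python
import random
from itertools import combinations

def kruskal(vertices, edges):
    """Minimum spanning forest via Kruskal with a simple union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    forest = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((w, u, v))
    return forest

def partitioned_mst(vertices, edges, k, seed=0):
    """Sparsify with per-pair spanning forests, then solve on the union H."""
    rng = random.Random(seed)
    # Randomly assign each vertex to one of k parts (k >= 2 assumed).
    part = {v: rng.randrange(k) for v in vertices}
    h = set()
    for i, j in combinations(range(k), 2):
        sub_vertices = [v for v in vertices if part[v] in (i, j)]
        sub_edges = [e for e in edges
                     if part[e[1]] in (i, j) and part[e[2]] in (i, j)]
        # In MapReduce, each of these calls would run on its own reducer.
        h.update(kruskal(sub_vertices, sub_edges))
    return kruskal(vertices, h)
```

On a connected graph the result has |V| - 1 edges and the same total weight as an MST computed directly on the full edge set.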
Finding an MST (cont.)

The algorithm is easy to parallelize
• The MST of each subgraph can be
computed in parallel

Why does it work?
• Theorem: the MST of H is an MST
of G
• Proof: we did not discard any relevant
edge when sparsifying the input graph G
Finding an MST (cont.)

Why is the algorithm in MRC?
• Let $N = |V|$ and $m = |E| = N^{1+c}$
• So the input size $n$ satisfies $N = n^{1/(1+c)}$
• Pick $k = N^{c/2}$
• Lemma: with high probability, each
bipartite subgraph has size $N^{1+c/2}$
• So the input to any reducer is $O(n^{1-\epsilon})$
• The size of H is also $O(n^{1-\epsilon})$
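The value of $\epsilon$ implied by these bounds follows from rewriting $N^{1+c/2}$ in terms of $n$. A short derivation, treating $n$ as $N^{1+c}$ and ignoring constant and logarithmic factors:

$$
N^{1+c/2} = \left(n^{1/(1+c)}\right)^{1+c/2} = n^{\frac{1+c/2}{1+c}} = n^{1 - \frac{c}{2(1+c)}},
\qquad\text{so}\qquad \epsilon = \frac{c}{2(1+c)}.
$$

For example, with $c = 1$ (so $m = N^2$) this gives $\epsilon = 1/4$, and each reducer handles about $n^{3/4}$ edges.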
Functions Lemma


A very useful building block for designing
MapReduce algorithms
Definition [MRC-parallelizable function]: Let
S be a finite set. We say a function f on S is
MRC-parallelizable if there are functions g
and h so that the following hold:
• For any partition of S, $S = T_1 \cup T_2 \cup \ldots \cup T_k$,
f can be written as $f(S) = h(g(T_1), g(T_2), \ldots, g(T_k))$.
• g and h each can be represented in $O(\log n)$ bits.
• g and h can be computed in time polynomial in
|S| and all possible outputs of g can be
expressed in $O(\log n)$ bits.
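A simple function that fits the shape of this definition is the sum of a set of integers: g computes a partial sum on each part and h adds the partial sums. The sketch below is only an illustration with hypothetical names (`g`, `h`, `random_partition`); it checks the decomposition f(S) = h(g(T1), ..., g(Tk)) on one random partition and does not verify the bit-size conditions, which hold when the integers are polynomially bounded in n.

```python
import random

def g(part):
    """Per-part summary: the partial sum."""
    return sum(part)

def h(partials):
    """Combine the per-part summaries into f(S)."""
    return sum(partials)

def random_partition(items, k, seed=0):
    """Split items into k parts at random (some parts may be empty)."""
    rng = random.Random(seed)
    parts = [[] for _ in range(k)]
    for x in items:
        parts[rng.randrange(k)].append(x)
    return parts

s = list(range(1, 101))
parts = random_partition(s, 7)
combined = h([g(t) for t in parts])
```

The same pattern works for max, min, and count; it fails for functions such as the median, where no small per-part summary is enough.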
Functions Lemma (cont.)

Lemma (Functions Lemma)
Let U be a universe of size n and let $S = \{S_1, \ldots, S_k\}$
be a collection of subsets of U, where $k \le n^{2-3\epsilon}$
and $\sum_{i=1}^{k} |S_i| \le n^{2-2\epsilon}$. Let $F = \{f_1, \ldots, f_k\}$ be a
collection of MRC-parallelizable functions. Then
the output $f_1(S_1), \ldots, f_k(S_k)$ can be computed
using $O(n^{1-\epsilon})$ reducers each with $O(n^{1-\epsilon})$ space.
Functions Lemma (cont.)

The power of the lemma
• Algorithm designer may focus only on the
structure of the subproblem and the input
• Distributing the input across reducers is
handled by the lemma (existence
theorem)

Proof of the lemma is not easy
• Use universal hashing
• Use Chernoff bound, etc
Application of the Functions
Lemma: s-t connectivity



Problem: given a graph G and two
nodes, are they connected in G?
Dense graphs: easy, powering
adjacency matrix
Sparse graphs?
An $O(\log n)$-round MapReduce algorithm
for s-t connectivity


Initially every node is active
For $i = 1, 2, \ldots, O(\log n)$ do
• Each active node becomes a leader with
probability ½
• For each non-leader active node u, find a node
v in the neighborhood of u's current connected
component
• If the connected component of v is non-empty,
then u becomes passive and each node
in u's connected component is relabeled with v's label

Output TRUE if s and t have the same
label, FALSE otherwise
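The label-merging loop can be simulated sequentially as follows. This sketch makes several simplifying assumptions that are not in the slides: it runs until no edge joins two different labels instead of stopping after a fixed O(log n) rounds (so the answer is exact), a non-leader component only merges into a neighboring component whose label is a leader, and the random choices are seeded for reproducibility. The name `st_connected` is hypothetical.

```python
import random

def st_connected(n, edges, s, t, seed=0):
    """Decide whether s and t are connected by randomized label merging."""
    rng = random.Random(seed)
    label = list(range(n))  # every node starts as its own component
    while True:
        # Edges whose endpoints currently carry different labels.
        cross = [(label[a], label[b]) for a, b in edges
                 if label[a] != label[b]]
        if not cross:
            break  # every connected component now has a single label
        # Each current component label becomes a leader with probability 1/2.
        leaders = {l for l in sorted(set(label)) if rng.random() < 0.5}
        # Each non-leader component picks one adjacent leader component.
        target = {}
        for la, lb in cross:
            for u, v in ((la, lb), (lb, la)):
                if u not in leaders and v in leaders:
                    target.setdefault(u, v)
        # Relabel: leaders never move in this round, so no chains form.
        label = [target.get(l, l) for l in label]
    return label[s] == label[t]
```

Because merges only happen along edges, two nodes end with the same label exactly when they lie in the same connected component. Each cross edge has probability at least 1/4 of triggering a merge in a given round, which is the intuition behind the logarithmic round bound.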
Conclusions



A rigorous model for MapReduce
Very loose hardware
requirements
Call for more research in this
direction
THANK YOU!