SkewReduce
Skew-Resistant Parallel Processing of Feature-Extracting Scientific User-Defined Functions
YongChul Kwon, Magdalena Balazinska, Bill Howe (University of Washington), Jerome Rolia (HP Labs)
Published in SoCC 2010
Motivation
• Science is becoming a data management problem
• MapReduce is an attractive solution
– Simple API, declarative layer, seamless scalability, …
• But it is hard to
  – Express complex algorithms, and
  – Get good performance (14 hours vs. 70 minutes for the same analysis)
• SkewReduce:
– Goal: Scalable analysis with minimal effort
– Toward scalable feature extraction analysis
Application 1: Extracting Celestial Objects
• Input
– { (x,y,ir,r,g,b,uv,…) }
• Coordinates
• Light intensities
• …
• Output
– List of celestial objects
    • Star
    • Galaxy
    • Planet
    • Asteroid
    • …
[Figure: M34 from the Sloan Digital Sky Survey]
Application 2: Friends of Friends
• Simple clustering algorithm used in astronomy
• Input:
  – Points in multi-dimensional space
• Output:
  – List of clusters (e.g., galaxies, …)
  – Original data annotated with cluster ID
• Friends
  – Two points within ε distance
• Friends of Friends
  – Transitive closure of the Friends relation
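
Since the definition is just a transitive closure over an ε-neighborhood relation, a union-find structure captures it directly. Below is a minimal serial sketch (function and variable names are mine, not from the paper); the all-pairs friend test makes it O(N²), which is part of why the parallel version later in the talk matters.

```python
import math

def friends_of_friends(points, eps):
    """Naive serial Friends of Friends: union all pairs within eps,
    then read off connected components as cluster IDs."""
    n = len(points)
    parent = list(range(n))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # "Friends": two points within eps distance (O(N^2) pair tests here)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= eps:
                union(i, j)

    # "Friends of Friends": transitive closure = connected components
    return [find(i) for i in range(n)]

# Example: two clusters on a line
print(friends_of_friends([(0.0,), (0.5,), (5.0,), (5.4,)], eps=1.0))
# -> [1, 1, 3, 3]: points 0,1 share a cluster ID, as do points 2,3
```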
Parallel Friends of Friends
• Partition
  – Split the space into partitions P1–P4
• Local clustering
  – Cluster each partition independently, yielding local clusters C1–C6
• Merge
  – P1-P2
  – P3-P4
  – Relabel clusters that span a partition boundary (e.g., C5→C3, C6→C5)
• Merge
  – P1-P2-P3-P4
  – (e.g., C4→C3, C6→C3)
• Finalize
  – Annotate original data
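
A hedged sketch of one merge step, assuming each partition ships only the points within ε of the shared boundary along with their local cluster IDs. Chains of relabelings (like C6→C5→C3 above) would need an extra union-find pass in a real implementation; this sketch handles only the direct case.

```python
import math

def merge_partitions(boundary_pts_a, boundary_pts_b, eps):
    """Merge two locally clustered partitions.

    boundary_pts_*: list of (point, cluster_id) for points lying
    within eps of the shared partition boundary. Returns a relabeling
    map, e.g. {'C5': 'C3'}, for the second partition's clusters."""
    relabel = {}
    for p, ca in boundary_pts_a:
        for q, cb in boundary_pts_b:
            if math.dist(p, q) <= eps:
                # Clusters touch across the boundary: fold cb into ca
                relabel[cb] = relabel.get(ca, ca)
    return relabel

# Cluster C5 in P2 touches cluster C3 in P1 across the boundary:
print(merge_partitions([((1.0, 2.0), "C3")],
                       [((1.2, 2.1), "C5")], eps=0.5))
# -> {'C5': 'C3'}
```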
Parallel Feature Extraction
[Figure: input data flows through partition, extract, merge, and finalize steps to produce features]
• Partition multi-dimensional input data
• Map: extract features from each partition
• Merge (or reconcile) features: "Hierarchical Reduce"
• Finalize output
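
The same skeleton works for any feature-extracting UDF, not just clustering. Here is a rough rendering of the partition / map / hierarchical-reduce / finalize flow; `extract`, `merge`, and `finalize` are stand-ins for the user-defined functions, not the paper's actual API names.

```python
def run_pipeline(partitions, extract, merge, finalize):
    """Partition -> map -> hierarchical reduce -> finalize.

    extract(partition)     -> local features (the "map" step)
    merge(feat_a, feat_b)  -> reconciled features for the union
    finalize(features)     -> annotated output
    """
    # Map: extract features from each partition (parallel in practice)
    features = [extract(p) for p in partitions]

    # Hierarchical reduce: merge neighboring results pairwise, level
    # by level, until one feature set remains (P1-P2, P3-P4, then all)
    while len(features) > 1:
        features = [merge(features[i], features[i + 1])
                    if i + 1 < len(features) else features[i]
                    for i in range(0, len(features), 2)]

    return finalize(features[0])

# Toy usage: "features" are just sums of values in each partition
print(run_pipeline([[1, 2], [3], [4, 5]],
                   extract=sum,
                   merge=lambda a, b: a + b,
                   finalize=str))  # -> "15"
```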
Problem: Skew
[Figure: per-task timeline on an 8-node cluster (32 map / 32 reduce slots); y-axis: task ID, x-axis: time. Local clustering (map) tasks and merge (reduce) tasks; annotated task times range from 5 minutes to 35 minutes.]
• The top red line (the longest task) runs for 1.5 hours
Unbalanced Computation: Skew
• Computation skew
  – A characteristic of the algorithm, not of the data volume
• Same amount of input data != same runtime
  – O(N log N) with 0 friends per particle
  – ~O(N²) with O(N) friends per particle
  – (illustrated in the sketch below)

Can we scale out an off-the-shelf implementation with no (or minimal) modifications?
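
To make the skew concrete, the toy probe below counts ε-pairs (each of which triggers union work in Friends of Friends) on two equal-sized inputs, one spread out and one dense. All numbers are illustrative.

```python
import math
import random

def count_friend_pairs(points, eps):
    """Count pairs within eps. In Friends of Friends each such pair
    triggers work (a union), so friend density, not input size,
    drives a task's runtime."""
    n = len(points)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if math.dist(points[i], points[j]) <= eps)

random.seed(0)
n, eps = 2000, 0.05
sparse = [(random.uniform(0, 10000),) for _ in range(n)]  # ~0 friends each
dense  = [(random.uniform(0, 1),)     for _ in range(n)]  # many friends each

for name, pts in [("sparse", sparse), ("dense", dense)]:
    print(name, count_friend_pairs(pts, eps), "friend pairs")
# Same input size, wildly different friend counts -> skewed task times
```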
Solution 1? Micro partitions
• Assign a tiny amount of work to each task to reduce skew
How about having micro partitions?
[Figure: completion time (hours) vs. number of partitions (256, 1024, 4096, 8192) on an 8-node cluster with 32 map / 32 reduce slots]
• It works!
• But framework overhead grows with the number of partitions
• To find the sweet spot, you need to try different granularities (a toy model of the trade-off follows below)!

Can we find a good partitioning plan without trial and error?
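
A toy cost model of that trade-off (all constants are made up for illustration): more partitions shrink the residual skew term but multiply the per-task framework overhead, giving the U-shaped curve in the plot.

```python
# Toy model: completion time vs. number of partitions. Constants are
# invented for illustration; only the shape of the curve matters.
def completion_hours(k, total_work=100.0, overhead=0.01,
                     slots=64, skew=4.0):
    per_task = total_work / k   # skew penalty shrinks with granularity
    waves = k / slots           # scheduling rounds on the cluster
    return waves * overhead + total_work / slots + skew * per_task

for k in (256, 1024, 4096, 8192):
    print(k, "partitions ->", round(completion_hours(k), 2), "hours")
# Overhead eventually outweighs the skew reduction: a sweet spot exists
```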
Outline
• Motivation
• SkewReduce
– API (in the paper)
– Partition Optimization
• Evaluation
• Summary
Partition Optimization
[Figure: a partition plan and its merge tree; leaf partitions run the serial feature-extraction algorithm, internal nodes run the merge algorithm]
• Varying granularities of partitions
• Can we automatically find a good partition plan and schedule?
Approach
[Figure: the SkewReduce optimizer takes a data sample, the user's cost functions, and the cluster configuration, and produces a runtime plan]
• Goal: minimize expected total runtime
• SkewReduce runtime plan
  – Bounding boxes for data partitions
  – Schedule
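
In code form, the optimizer's interface might look like the stub below; the field and parameter names are my guesses, not SkewReduce's actual API.

```python
from dataclasses import dataclass

@dataclass
class RuntimePlan:
    """Output of the optimizer (field names are guesses)."""
    bboxes: list     # one bounding box per data partition
    schedule: list   # (partition_id, slot, start_time) assignments

def optimize(sample, feature_cost, merge_cost, cluster_config):
    """Search partition plans (see the greedy procedure later in the
    deck) and return the one with the lowest expected total runtime."""
    ...
```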
Partition Plan Guided By Cost Functions
"Given a sample, how long will it take to process?"
• Two cost functions:
  – Feature cost: (bounding box, sample, sample rate) → cost
  – Merge cost: (bounding boxes, sample, sample rate) → cost
• Basis of two optimization decisions
  – How (axis, point) to split a partition
  – When to stop partitioning
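
The exact signatures are in the paper; below is a guessed Python rendering with FoF-flavored estimator bodies. `BBox`, the ε-pair density probe, and the boundary heuristic are all assumptions for illustration.

```python
import math

class BBox:
    """Axis-aligned bounding box (an illustration, not the paper's type)."""
    def __init__(self, lo, hi):      # lo, hi: opposite corner tuples
        self.lo, self.hi = lo, hi

    def contains(self, p):
        return all(l <= x <= h for l, x, h in zip(self.lo, p, self.hi))

def feature_cost(bbox, sample, sample_rate, eps=0.1):
    """Estimate the cost of running the serial feature-extraction
    algorithm on the data falling inside bbox."""
    pts = [p for p in sample if bbox.contains(p)]
    n = len(pts) / sample_rate       # scale the sample up to true size
    # Friend density from the sample: eps-pairs per sampled point
    pairs = sum(math.dist(p, q) <= eps
                for i, p in enumerate(pts) for q in pts[i + 1:])
    f = pairs / max(len(pts), 1)
    return n * math.log2(n + 1) + f * n   # O(N log N) + friend work

def merge_cost(bboxes, sample, sample_rate, eps=0.1):
    """Estimate merge cost: points within eps of a partition face."""
    near = sum(any(min(x - l, h - x) <= eps
                   for l, x, h in zip(b.lo, p, b.hi))
               for b in bboxes for p in sample if b.contains(p))
    return near / sample_rate

# Usage: a 1% sample of data in the unit square
box = BBox((0.0, 0.0), (1.0, 1.0))
print(feature_cost(box, [(0.1, 0.2), (0.15, 0.22)], sample_rate=0.01))
```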
Search Partition Plan
• Greedy top-down search
  – Split if total expected runtime improves
    • Evaluate costs for subpartitions and merge
    • Estimate new runtime
[Figure: worked example. Original partition: feature cost 100. Possible split: two subpartitions with feature cost 50 each, plus a merge of cost 10. Schedule 1 (two slots, subpartitions in parallel): max(50, 50) + 10 = 60. Schedule 2 (one slot, subpartitions in series): 50 + 50 + 10 = 110. Whether a split pays off depends on the schedule.]
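
A minimal sketch of that greedy loop, assuming a caller-supplied `split` function and scalar costs; the real optimizer also chooses the split axis and point via the cost functions and tracks slot availability more carefully.

```python
def plan(partition, feature_cost, merge_cost, split, slots=2):
    """Greedy top-down partitioning. Returns the list of leaf
    partitions whose further splitting would not improve runtime.

    split(partition) -> (left, right) halves, e.g. at the midpoint
    of the widest axis. Costs are expected runtimes."""
    whole = feature_cost(partition)
    left, right = split(partition)
    cl, cr = feature_cost(left), feature_cost(right)

    # Expected runtime after splitting: halves in parallel if slots
    # allow (Schedule 1), else serially (Schedule 2), plus the merge
    halves = max(cl, cr) if slots >= 2 else cl + cr
    after = halves + merge_cost(left, right)

    if after >= whole:          # stop: splitting doesn't pay off
        return [partition]
    return (plan(left, feature_cost, merge_cost, split, slots) +
            plan(right, feature_cost, merge_cost, split, slots))

# Toy usage on 1D intervals; cost = squared length (superlinear work)
mid = lambda p: ((p[0], (p[0] + p[1]) / 2), ((p[0] + p[1]) / 2, p[1]))
print(plan((0.0, 8.0),
           feature_cost=lambda p: (p[1] - p[0]) ** 2,
           merge_cost=lambda a, b: 1.0,
           split=mid))
```

On this toy input the superlinear cost makes every split pay off down to unit intervals, mirroring why dense, skewed regions deserve finer partitions.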
Summary of Contributions
• Given a feature extraction application
– Possibly with computation skew
• SkewReduce
– Automatically partitions input data
– Improves runtime in spite of computation skew
• Key technique: user-defined cost functions
Evaluation
• 8-node cluster
  – Dual quad-core CPUs, 16 GB RAM per node
  – Hadoop 0.20.1 + a custom patch to the MapReduce API
• Distributed Friends of Friends
– Astro: Gravitational simulation snapshot
• 900 M particles
– Seaflow: flow cytometry survey
• 59 M observations
Does SkewReduce work?

Runtime by partitioning plan (Astro: 18 GB, 3D; Seaflow: 1.9 GB, 3D):

Plan                           Astro (hours)   Seaflow (minutes)
MapReduce, 128 MB partitions   14.1            87.2
MapReduce, 16 MB partitions    8.8             63.1
MapReduce, 4 MB partitions     4.1             77.7
MapReduce, 2 MB partitions     5.7             98.7
Manual (1 hour preparation)    2.0             -
SkewReduce                     1.6             14.1
• The SkewReduce plan yields 2x to 8x faster runtimes
Impact of Cost Function
[Figure: Astro dataset. Completion time (hours) for three cost functions of increasing fidelity: Data Size, Histogram 1D, Histogram 3D]
• Higher fidelity = better performance
Highlights of Evaluation
• Sample size
  – Representativeness of the sample is important
• Runtime of SkewReduce optimization
  – Less than 15% of the runtime of the resulting SkewReduce plan
• Data volume in Merge phase
  – Total volume during Merge = 1% of input data
• Details in the paper
Conclusion
• Scientific analysis should be easy to write, scalable, and predictable in performance
• SkewReduce
  – API for feature-extracting functions
  – Scalable execution
  – Good performance in spite of skew
    • Cost-based partition optimization using a data sample
• Published in SoCC 2010
  – A more general version is coming out soon!