DryadLINQ: Making Large-Scale Distributed

The DryadLINQ Approach to
Distributed Data-Parallel Computing
Yuan Yu
Microsoft Research Silicon Valley
Distributed Data-Parallel Computing
• Dryad talk: the execution layer
– How to reliably and efficiently execute distributed
data-parallel programs on a compute cluster?
• This talk: the programming model
– How to write distributed data-parallel programs for
a compute cluster?
The Programming Model
• Sequential, single machine programming abstraction
• Same program runs on single-core, multi-core, or
cluster
• Preserve the existing programming environments
– Modern programming languages (C# and Java) are very good
• Expressive language and data model
• Strong static typing, GC, generics, …
– Modern IDEs (Visual Studio and Eclipse) are very good
• Great debugging and library support
• Legacy code could be easily reused
Dryad and DryadLINQ
DryadLINQ provides automatic query plan generation
Dryad provides automatic distributed execution
Outline
• Programming model
• DryadLINQ
• Applications
• Discussions and conclusion
LINQ
• Microsoft’s Language INtegrated Query
– Available in .NET 3.5 and Visual Studio 2008
• A set of operators to manipulate datasets in .NET
– Support traditional relational operators
• Select, Join, GroupBy, Aggregate, etc.
– Integrated into .NET programming languages
• Programs can invoke operators
• Operators can invoke arbitrary .NET functions
• Data model
– Data elements are strongly typed .NET objects
– Much more expressive than relational tables
• For example, nested data structures
LINQ Framework
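[Figure: a .NET program (C#, VB, F#, etc.) on the local machine exchanges queries and objects with the LINQ provider interface. Execution engines behind the interface: DryadLINQ, PLINQ, LINQ-to-SQL, LINQ-to-Obj. A scalability axis runs from single-core through multi-core to cluster.]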
A Simple LINQ Query
IEnumerable<BabyInfo> babies = ...;
var results = from baby in babies
              where baby.Name == queryName &&
                    baby.State == queryState &&
                    baby.Year >= yearStart &&
                    baby.Year <= yearEnd
              orderby baby.Year ascending
              select baby;
A Simple PLINQ Query
IEnumerable<BabyInfo> babies = ...;
var results = from baby in babies.AsParallel()
              where baby.Name == queryName &&
                    baby.State == queryState &&
                    baby.Year >= yearStart &&
                    baby.Year <= yearEnd
              orderby baby.Year ascending
              select baby;
A Simple DryadLINQ Query
PartitionedTable<BabyInfo> babies =
    PartitionedTable.Get<BabyInfo>("BabyInfo.pt");
var results = from baby in babies
              where baby.Name == queryName &&
                    baby.State == queryState &&
                    baby.Year >= yearStart &&
                    baby.Year <= yearEnd
              orderby baby.Year ascending
              select baby;
DryadLINQ Data Model
[Figure: a partitioned table consists of partitions, each holding .NET objects.]
Partitioned table exposes metadata information
– type, partition, compression scheme, serialization, etc.
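The idea of running the same filter-and-sort query over a partitioned table can be sketched outside .NET. The following is a minimal Python model, not the DryadLINQ API: the `PartitionedTable` class, its `where_order_by_year` method, and the `BabyInfo` fields are hypothetical stand-ins chosen to mirror the slide. Each partition is filtered and sorted independently, then the sorted runs are merged.

```python
from dataclasses import dataclass
from heapq import merge
from typing import Callable, List


@dataclass(frozen=True)
class BabyInfo:
    name: str
    state: str
    year: int


@dataclass
class PartitionedTable:
    """Toy stand-in: a record type (metadata) plus a list of partitions."""
    record_type: type
    partitions: List[List[BabyInfo]]

    def where_order_by_year(self, pred: Callable[[BabyInfo], bool]) -> List[BabyInfo]:
        # Filter and sort every partition on its own; in a cluster these
        # would be independent tasks.
        runs = [sorted((b for b in part if pred(b)), key=lambda b: b.year)
                for part in self.partitions]
        # Merge the sorted runs into one globally ordered result.
        return list(merge(*runs, key=lambda b: b.year))
```

A caller would pass the `where` clause as a predicate, for example `table.where_order_by_year(lambda b: b.name == "Ann" and 1990 <= b.year <= 2005)`.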
Demo
• It is just programming
– The same familiar programming languages,
development tools, libraries, etc.
K-means Execution Graph
[Figure: k-means execution graph with vertices labeled C0, C1, C2, C3, P1, P2, P3, ac, and cc.]
K-means in DryadLINQ
public class Vector {
    public double[] entries;
    [Associative]
    public static Vector operator +(Vector v1, Vector v2) { … }
    public static Vector operator -(Vector v1, Vector v2) { … }
    public double Norm2() { … }
}

public static Vector NearestCenter(Vector v, IEnumerable<Vector> centers) {
    return centers.Aggregate((r, c) => (r - v).Norm2() < (c - v).Norm2() ? r : c);
}

public static IQueryable<Vector> Step(IQueryable<Vector> vectors, IQueryable<Vector> centers) {
    return vectors.GroupBy(v => NearestCenter(v, centers))
                  .Select(group => group.Aggregate((x, y) => x + y) / group.Count());
}

var vectors = PartitionedTable.Get<Vector>("dfs://vectors.pt");
var centers = vectors.Take(100);
for (int i = 0; i < 10; i++) {
    centers = Step(vectors, centers);
}
centers.ToPartitionedTable<Vector>("dfs://centers.pt");
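For readers who want to trace one iteration by hand, here is a small single-machine Python sketch of the same step: group every vector by its nearest center, then replace each group with its mean. It is an illustration only. Vectors are plain tuples, the helper names are hypothetical, and squared Euclidean distance is assumed for the comparison (the slide does not show how `Norm2` is defined).

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vec = Tuple[float, ...]


def dist2(a: Vec, b: Vec) -> float:
    # Squared Euclidean distance (assumed metric).
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(v: Vec, centers: Sequence[Vec]) -> Vec:
    return min(centers, key=lambda c: dist2(c, v))


def step(vectors: Sequence[Vec], centers: Sequence[Vec]) -> List[Vec]:
    # GroupBy: bucket each vector under its nearest center.
    groups: Dict[Vec, List[Vec]] = defaultdict(list)
    for v in vectors:
        groups[nearest_center(v, centers)].append(v)
    # Select: each group's new center is the component-wise mean.
    return [tuple(sum(col) / len(g) for col in zip(*g)) for g in groups.values()]
```

Centers that attract no vectors drop out of the result, which matches what a group-by followed by a per-group average produces.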
PageRank Execution Graph
[Figure: PageRank execution graph with vertices labeled N01 to N03, N11 to N13, N21 to N23, N31 to N33, E1 to E3, D, ae, and cc.]
PageRank in DryadLINQ
public struct Page {
    public UInt64 name;
    public Int64 degree;
    public UInt64[] links;

    public Page(UInt64 n, Int64 d, UInt64[] l) {
        name = n; degree = d; links = l; }

    public Rank[] Disperse(Rank rank) {
        Rank[] ranks = new Rank[links.Length];
        double score = rank.rank / this.degree;
        for (int i = 0; i < ranks.Length; i++) {
            ranks[i] = new Rank(this.links[i], score);
        }
        return ranks;
    }
}

public struct Rank {
    public UInt64 name;
    public double rank;

    public Rank(UInt64 n, double r) {
        name = n; rank = r; }
}

public static IQueryable<Rank> Step(IQueryable<Page> pages,
                                    IQueryable<Rank> ranks) {
    // join pages with ranks, and disperse updates
    var updates = from page in pages
                  join rank in ranks on page.name equals rank.name
                  select page.Disperse(rank);
    // re-accumulate.
    return from list in updates
           from rank in list
           group rank.rank by rank.name into g
           select new Rank(g.Key, g.Sum());
}

var pages = PartitionedTable.Get<Page>("dfs://pages.pt");
var ranks = pages.Select(page => new Rank(page.name, 1.0));
// repeat the iterative computation several times
for (int iter = 0; iter < n; iter++) {
    ranks = Step(pages, ranks);
}
ranks.ToPartitionedTable<Rank>("dfs://ranks.pt");
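The join, disperse, and re-accumulate pattern of a single step can be traced on a tiny graph with the Python sketch below. It is a simplified model under stated assumptions: pages are a dictionary from page name to outgoing links, a page's degree is taken to be the number of its links, and pages with no links contribute nothing. The function name `step` mirrors the slide, but nothing here is DryadLINQ API.

```python
from collections import defaultdict
from typing import Dict, List


def step(links: Dict[int, List[int]], ranks: Dict[int, float]) -> Dict[int, float]:
    new_ranks: Dict[int, float] = defaultdict(float)
    for name, out in links.items():
        # Join: only pages that have a current rank take part.
        if name not in ranks or not out:
            continue
        # Disperse: split the page's rank evenly over its links
        # (assumes degree == len(out)).
        score = ranks[name] / len(out)
        for target in out:
            # Re-accumulate: sum the contributions per target page.
            new_ranks[target] += score
    return dict(new_ranks)
```

Calling `step` repeatedly, feeding each result back in as `ranks`, corresponds to the loop on the slide.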
MapReduce in DryadLINQ
MapReduce(source,        // sequence of Ts
          mapper,        // T -> Ms
          keySelector,   // M -> K
          reducer)       // (K, Ms) -> Rs
{
    var map = source.SelectMany(mapper);
    var group = map.GroupBy(keySelector);
    var result = group.SelectMany(reducer);
    return result;       // sequence of Rs
}

// Ex: Count the frequencies of words in a book
MapReduce(book,
          line => line.Split(' '),
          w => w.ToLower(),
          g => g.Count())
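The same three-operator composition (flat-map, group-by, flat-map) can be written as a short, runnable Python function. This is a sketch for illustration; `map_reduce` and its parameter names are hypothetical. One difference from the slide is made explicit: the reducer here returns a sequence, so that the final flat-map step is well typed, whereas the slide's word-count example returns a bare count.

```python
from collections import defaultdict
from itertools import chain
from typing import Callable, Dict, Hashable, Iterable, List


def map_reduce(source: Iterable, mapper: Callable, key_selector: Callable,
               reducer: Callable) -> List:
    # SelectMany(mapper): each input yields zero or more intermediate items.
    mapped = chain.from_iterable(mapper(t) for t in source)
    # GroupBy(keySelector): bucket intermediate items by key.
    groups: Dict[Hashable, list] = defaultdict(list)
    for m in mapped:
        groups[key_selector(m)].append(m)
    # SelectMany(reducer): each (key, items) group yields zero or more results.
    return list(chain.from_iterable(reducer(k, ms) for k, ms in groups.items()))


def word_count(book: Iterable[str]) -> List:
    return map_reduce(book,
                      lambda line: line.split(" "),
                      lambda w: w.lower(),
                      lambda k, ms: [(k, len(ms))])
```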
DryadLINQ System Architecture
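[Figure: DryadLINQ system architecture. On the client machine, a .NET program issues a LINQ query; DryadLINQ turns the query expression into a distributed query plan and vertex code, then invokes Dryad. On the cluster, Dryad executes the plan over the input tables and writes output tables. The results come back to the program as an output table of .NET objects, consumed with foreach.]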
DryadLINQ
• Distributed execution plan generation
– Static optimizations: pipelining, eager aggregation, etc.
– Dynamic optimizations: data-dependent partitioning,
dynamic aggregation, etc.
• Vertex runtime
– Single machine (multi-core) implementation of LINQ
– Vertex code that runs on vertices
– Data serialization code
– Callback code for runtime dynamic optimizations
– Automatically distributed to cluster machines
A Simple Example
• Count the frequencies of words in a book
var map = book.SelectMany(line => line.Split(' '));
var group = map.GroupBy(w => w.ToLower());
var result = group.Select(g => g.Count());
Naïve Execution Plan
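[Figure: naïve execution plan. Stages: M (Map, line.Split(' ')) -> D (Distribute) -> MG (Merge) -> G (GroupBy) -> R (Reduce, g.Count()) -> X (Consumer). The Map and Distribute stages form the map phase; Merge, GroupBy, and Reduce form the reduce phase.]
Execution Plan Using Partial Aggregation
[Figure: execution plan using partial aggregation. Stages: M (Map) -> G1 (GroupBy) -> IR (InitialReduce) -> D (Distribute) -> MG (Merge) -> G2 (GroupBy) -> C (Combine) -> MG (Merge) -> G3 (GroupBy) -> F (FinalReduce) -> X (Consumer). The Merge, GroupBy, and Combine stages form an aggregation tree. The figure is annotated with g.Count() and g.Sum().]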
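The effect of partial aggregation on the word-count example can be seen in a few lines of Python. This is a single-process sketch under simplifying assumptions: each inner list stands for one input partition, the distribute and merge stages are collapsed into a single combine call, and the function names are hypothetical. The initial reduce counts words inside each partition, so only small per-partition tables cross the network, and the later stages add the partial counts.

```python
from collections import Counter
from typing import Iterable, List


def initial_reduce(partition: Iterable[str]) -> Counter:
    # Map + local GroupBy + InitialReduce: count words within one partition.
    return Counter(w.lower() for line in partition for w in line.split(" "))


def combine(partials: Iterable[Counter]) -> Counter:
    # Combine / FinalReduce: partial counts are summed, not recounted.
    total: Counter = Counter()
    for p in partials:
        total.update(p)
    return total


def word_count(partitions: List[List[str]]) -> Counter:
    return combine(initial_reduce(p) for p in partitions)


def naive_word_count(partitions: List[List[str]]) -> Counter:
    # Naive plan for comparison: ship every word, count once at the end.
    return Counter(w.lower() for p in partitions for line in p for w in line.split(" "))
```

The rewrite is valid because counting decomposes into a per-partition count followed by a sum, which is why the later stages use a sum rather than a count.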
Challenge: Program Analysis Support
• The main sources of difficulty
– Complicated data model
– User-defined functions all over the place
• Requires sophisticated static program
analysis at byte-code level
– Mainly some form of flow analysis
• Possible with modern programming
languages and runtimes, such as C#/CLR
Inferring Dataset Property
• Useful for query optimizations
Hash(y => y.a+y.b)
Select(x => new { A=x.a+x.b, B=f(x.c) })
Hash(z => z.A)
GroupBy(x => x.A)
No data partitioning at GroupBy
Inferring Dataset Property
• Useful for query optimizations
Hash(y => y.a+y.b)
Select(x => Foo(x))
Hash(x => x.A)?
GroupBy(x => x.A)
Need IL-level data-flow analysis of Foo
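Why the first case needs no repartitioning can be checked with a small experiment. The Python sketch below is an illustration with hypothetical helper names, and the identity function stands in for `f`. Records are hash-partitioned on `a + b`, then projected to `A = a + b`. Because the new field is exactly the old partitioning key, every value of `A` lives in one partition and a group-by on `A` can run locally. Partitioning on an unrelated field does not give that guarantee.

```python
from typing import Callable, Dict, Hashable, List

Record = Dict[str, int]


def hash_partition(records: List[Record], key: Callable[[Record], Hashable],
                   n: int) -> List[List[Record]]:
    parts: List[List[Record]] = [[] for _ in range(n)]
    for r in records:
        parts[hash(key(r)) % n].append(r)
    return parts


def project(parts: List[List[Record]]) -> List[List[Record]]:
    # Select(x => new { A = x.a + x.b, B = f(x.c) }), with f taken as identity.
    return [[{"A": x["a"] + x["b"], "B": x["c"]} for x in part] for part in parts]


def keys_are_local(parts: List[List[Record]],
                   key: Callable[[Record], Hashable]) -> bool:
    # True when no key value appears in more than one partition.
    owner: Dict[Hashable, int] = {}
    for i, part in enumerate(parts):
        for r in part:
            if owner.setdefault(key(r), i) != i:
                return False
    return True
```

An optimizer can only draw this conclusion when it can see how `A` is computed, which is the difficulty with an opaque function such as `Foo`.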
Caching Query Results
• A cluster-wide caching service to support
– Reuse of common subqueries
– Incremental computations
• Cache: [Key → Value]
– Key is <q, d>, Value is the result of q(d)
– Key requires a pretty good reachability analysis
• For correctness, Key must include “everything”
reachable from q
• For performance, Key should only contain things q
depends on
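The tension between correctness and performance in the cache key can be made concrete with a toy cache. The Python sketch below is only an analogy and makes strong assumptions: the key is built from the query's compiled code object, the values it captures in its closure, and a dataset identifier, and `QueryCache`, `query_key`, and `make_filter` are hypothetical names. It does not follow global variables or mutable state, which is the kind of gap a real reachability analysis would have to close.

```python
from typing import Any, Callable, Dict, Hashable, List, Tuple


def query_key(q: Callable, dataset_id: Hashable) -> Tuple:
    # Include what q reaches through its closure; a key that omitted the
    # captured values would wrongly conflate different queries.
    captured = tuple(c.cell_contents for c in (q.__closure__ or ()))
    return (q.__code__, captured, dataset_id)


class QueryCache:
    def __init__(self) -> None:
        self.store: Dict[Tuple, Any] = {}
        self.misses = 0

    def run(self, q: Callable, dataset_id: Hashable, data: List[int]) -> Any:
        key = query_key(q, dataset_id)
        if key not in self.store:
            self.misses += 1
            self.store[key] = q(data)
        return self.store[key]


def make_filter(threshold: int) -> Callable[[List[int]], List[int]]:
    # The returned query captures `threshold`.
    return lambda rows: [r for r in rows if r > threshold]
```

Two queries built with the same threshold share a cache entry, while a different threshold or a different dataset identifier produces a new one.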
More Static Analysis
• Purity checking
– All the functions called in DryadLINQ queries must be
side-effect free
– DryadLINQ (and PLINQ) doesn’t enforce it
• Metadata validity checking
– Partitioned table’s metadata contains the record type,
partition scheme, serialization functions, …
– Need to determine if the metadata is valid
– DryadLINQ doesn’t fully enforce it
• Static enforcement of program properties for
security and privacy mechanisms
Examples of DryadLINQ Applications
• Data mining
– Analysis of service logs for network security
– Analysis of Windows Watson/SQM data
– Cluster monitoring and performance analysis
• Graph analysis
– Accelerated Page-Rank computation
– Road network shortest-path preprocessing
• Image processing
– Image indexing
– Decision tree training
– Epitome computation
• Simulation
– Light flow simulations for next-generation display research
– Monte-Carlo simulations for mobile data
• eScience
– Machine learning platform for health solutions
– Astrophysics simulation
Decision Tree Training
Mihai Budiu, Jamie Shotton et al
Learn a decision tree to classify pixels in a large set of images
[Figure: labeled images feed a machine learning step that produces a decision tree.]
1M images × 10,000 pixels × 2,000 features × 2^21 tree nodes
Complexity > 10^20 objects
Sample Execution plan
Initial empty tree
Image inputs, partitioned
Read, preprocess
Redistribute
Histogram
Regroup histograms on node
Compute new tree layer
Compute new tree
Broadcast new tree
Final tree
Application Details
• Workflow = 37 DryadLINQ jobs
• 12 hours running time on 235 machines
• More than 100,000 processes
• More than 100 days of CPU time
• Recovers from several failures daily
• 34,000 lines of .NET code
Windows SQM Data Analysis
Michal Strehovsky, Sivarudrappa Mahesh et al
[Figure: SQM service data flow. SQM clients upload to front-end servers; data moves into compressed storage, is queried by a distributed execution engine running on a cluster, and feeds DataMarts, a reporting portal, and custom schema/storage.]
• SQM client: datapoints are collected on the client and uploaded.
• Front end (IIS servers): check validity and concatenate.
• File movers: move incoming data into inexpensive compressed storage.
• Distributed execution engine: queries data based on user-initiated ad hoc queries or well-defined queries for reporting.
• DataMarts: store data for reporting purposes.
• Reporting portal: rich reports created with customization capability.
• Data validation and analysis toolsets: query raw data directly.
• Custom schema/storage: extract data for storage in a custom schema with a different retention policy.
The Language Integration Approach
• Single unified programming environment
– Unified data model and programming language
– Direct access to IDE and libraries
– Different from SQL, HIVE, Pig Latin
• Multiple layers of languages and data models
• Works out very well, but requires good
programming language support
– LINQ extensibility: custom operators/providers
– .NET reflection, dynamic code generation, …
Combining with PLINQ
[Figure: DryadLINQ splits a query into subqueries, and PLINQ runs each subquery.]
The combination of PLINQ and DryadLINQ
delivers computation to every core in the
cluster
Acyclic Dataflow Graph
• Acyclic dataflow graph provides a very powerful
computation model
– Easy target for higher-level programming
abstractions such as DryadLINQ
– Easy expression of many data-parallel optimizations
• We designed Dryad to be general and flexible
– Programmability is less of a concern
– Used primarily to support higher-level programming
abstractions
– No major changes made to Dryad in order to
support DryadLINQ
Expectation Maximization (Gaussians)
• Generated by DryadLINQ
• 3 iterations shown
Decoupling of Dryad and DryadLINQ
• Separation of concerns
– Dryad layer concerns scheduling and fault-tolerance
– DryadLINQ layer concerns the programming model
and the parallelization of programs
– Result: powerful and expressive execution engine and
programming model
• Different from the MapReduce/Hadoop approach
– A single abstraction for both programming model and
execution engine
– Result: very simple, but very restricted execution
engine and language
Software Stack
[Figure: software stack, top to bottom]
• Applications: machine learning, image processing, graph analysis, data mining, eScience, …
• DryadLINQ
• Dryad
• CIFS/NTFS, SQL Servers, Azure DFS, Cosmos DFS
• Cluster Services (Azure, HPC, or Cosmos)
• Windows Server on each machine
Conclusion
• Single unified programming environment
– Unified data model and programming language
– Direct access to IDE and libraries
• An open and extensible system
– Many LINQ providers out there
• Existing ones: LINQ-to-XML, LINQ-to-SQL, PLINQ, …
• Very easy to write one for your app domain
– Dryad/DryadLINQ scales out all of them!
Availability
• Freely available for academic use
– http://connect.microsoft.com/DryadLINQ
– DryadLINQ source, Dryad binaries, documentation,
samples, blog, discussion group, etc.
• Will be available soon for commercial use
– Free, but no product support