SYSTEMS SUPPORT FOR GRAPHICAL LEARNING 9/18/2014 Ken Birman

CS6410 Fall 2014
Graphical models and applications

Artificial intelligence and machine learning are core technologies in many modern cloud settings
 Supporting social networking mechanisms
 Creating product placement recommendations
 Understanding the flow of “influence” within communities

Graphical processing can also matter in systems
 Understanding what to cache and what not to cache
 Learning common patterns so the system can optimize for them
CS5412 Spring 2014 (Cloud Computing: Birman)
What makes this hard?

The prior generation of solutions was too general
 Programming languages can do anything, but they aren’t at all specialized for graph-structured data
 Database systems are awesome for tabular data but much less optimized for graphical data

There is also an issue of scale
 We’re good at what can be done on one computer
 But a company like Facebook has more than a billion users and its infrastructure runs in massive data centers
Today’s papers

The TAO paper (I’ll start with this) gives a sense of the challenge Facebook confronts
 Like an entire distributed operating system
 But the whole role of the solution is to manage graphical data and support queries against it
 Massive loads and surreal scale
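
To make "manage graphical data and support queries against it" concrete, here is a minimal single-process sketch of a graph store with typed objects and timestamped edges. The class and method names (`GraphStore`, `add_edge`, `edge_range`, `edge_count`) are hypothetical, chosen for illustration, and are not taken from Facebook's system; a production design would shard and cache this data across many machines.

```python
from collections import defaultdict


class GraphStore:
    """Toy in-memory store for typed objects and typed, timestamped edges."""

    def __init__(self):
        self.objects = {}               # object id -> dict of fields
        self.edges = defaultdict(list)  # (source id, edge type) -> [(time, dest id)]

    def add_object(self, obj_id, **fields):
        self.objects[obj_id] = fields

    def add_edge(self, src, edge_type, dst, time):
        bucket = self.edges[(src, edge_type)]
        bucket.append((time, dst))
        # Keep newest first, assuming "most recent N" is the common query.
        bucket.sort(reverse=True)

    def edge_range(self, src, edge_type, offset=0, limit=10):
        """Return up to `limit` destination ids, newest first."""
        bucket = self.edges.get((src, edge_type), [])
        return [dst for _, dst in bucket[offset:offset + limit]]

    def edge_count(self, src, edge_type):
        return len(self.edges.get((src, edge_type), []))


store = GraphStore()
store.add_object("alice", kind="user")
store.add_object("post1", kind="post")
for t, user in enumerate(["bob", "carol", "dave"]):
    store.add_edge("post1", "liked_by", user, time=t)

recent_likers = store.edge_range("post1", "liked_by", limit=2)
like_count = store.edge_count("post1", "liked_by")
```

Even in this toy form, the queries are shaped by the graph: "who liked this post most recently" is a lookup on one (node, edge type) pair, not a scan over a table.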


Things to notice?
 How does the architecture of the solution reflect the special environment in which it runs?
 How did they identify and optimize the critical paths?

Dryad/LINQ

Here we see two concepts combined
 At Microsoft, LINQ has become very popular
 It embeds a kind of query processing into C# code

Dryad takes this one step further
 Given a LINQ expression, Dryad can run it on a distributed “computing engine” of their own design
 The idea is to obtain massive parallelism
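
As a rough illustration of the idea (not DryadLINQ's actual API, which is C#), the sketch below writes a query once as ordinary Python and then runs it over several data partitions in parallel, merging the partial results. The helper names `query_partition` and `run_distributed`, and the use of a thread pool to stand in for cluster machines, are assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def query_partition(records):
    """The 'query': count visits per user, ignoring visits under 5 seconds.

    Roughly: from r in records where r.seconds >= 5 group by r.user.
    """
    return Counter(user for user, seconds in records if seconds >= 5)


def run_distributed(partitions, workers=4):
    """Run the same query on every partition, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(query_partition, partitions))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


# Each inner list stands in for the data held by one machine (hypothetical data).
partitions = [
    [("ann", 12), ("bob", 3), ("ann", 7)],
    [("bob", 30), ("cat", 9)],
    [("ann", 1), ("cat", 40)],
]
views = run_distributed(partitions)
```

The point of the design is that the programmer writes only the query; splitting the work and merging the results is the engine's job.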
CS5412 Spring 2014 (Cloud Computing: Birman)
Basic architecture of Dryad
Execution of a LINQ expression
A join, done in two ways
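
The text here does not say which two plans are meant. Purely as an assumption, the sketch below shows two strategies commonly used for distributed joins: hash-partitioning both inputs on the join key, and broadcasting the smaller input to every partition of the larger one. Both are simulated on in-memory lists, and `partitioned_join` and `broadcast_join` are hypothetical names.

```python
from collections import defaultdict


def local_join(left, right):
    """Hash join of (key, value) pairs held on a single 'machine'."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


def partitioned_join(left, right, n_parts=3):
    """Way 1: shuffle both inputs by hash(key), then join matching partitions."""
    left_parts = [[] for _ in range(n_parts)]
    right_parts = [[] for _ in range(n_parts)]
    for key, value in left:
        left_parts[hash(key) % n_parts].append((key, value))
    for key, value in right:
        right_parts[hash(key) % n_parts].append((key, value))
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(local_join(lp, rp))
    return out


def broadcast_join(left_parts, small_right):
    """Way 2: leave the large input in place; copy the small input to each part."""
    out = []
    for part in left_parts:
        out.extend(local_join(part, small_right))
    return out


users = [(1, "ann"), (2, "bob"), (3, "cat")]
orders = [(1, "book"), (3, "pen"), (1, "lamp")]

way1 = sorted(partitioned_join(orders, users))
way2 = sorted(broadcast_join([orders[:2], orders[2:]], users))
```

Both plans produce the same rows. They differ in how much data moves: the first reshuffles both inputs, the second copies only the small one.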
MapReduce in Dryad/LINQ
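
MapReduce fits naturally into a query-operator style: a map that may emit several records per input, a group-by on a key, and a reduce over each group. In LINQ terms this roughly corresponds to chaining `SelectMany`, `GroupBy` and `Select`. The Python sketch below mirrors that three-step shape; it is an analogy rather than DryadLINQ code, and the function name `map_reduce` and the word-count data are chosen for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_reduce(source, mapper, key_of, reducer):
    # Step 1: flat-map; each input may yield zero or more records.
    mapped = chain.from_iterable(mapper(item) for item in source)
    # Step 2: group the mapped records by key.
    groups = defaultdict(list)
    for record in mapped:
        groups[key_of(record)].append(record)
    # Step 3: reduce each group to a single result.
    return [reducer(key, records) for key, records in groups.items()]


lines = ["the cat sat", "the dog sat"]
counts = dict(
    map_reduce(
        lines,
        mapper=lambda line: line.split(),
        key_of=lambda word: word,
        reducer=lambda word, words: (word, len(words)),
    )
)
```

In a distributed setting, each of the three steps can run in parallel over partitions, with a shuffle between the map and the group-by.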
Other major systems in this space



Check out http://en.wikipedia.org/wiki/Graph_database
They list 50 or so graphical databases and processing systems
Some popular ones in research settings are Pregel (from Google), GraphLab (CMU) and Vowpal Wabbit (“Fast Learning”) (Yahoo)
Takeaways

Computer systems need to be responsive to
 Styles of use (what our “customers” are doing)
 Common patterns of load (optimize for this case)

In today’s major cloud computing settings, graphical data and graphical learning solutions are becoming a dominant form of load and focus

Computer systems need to evolve to track this need