CS6410 Fall 2014
SYSTEMS SUPPORT FOR GRAPHICAL LEARNING
9/18/2014
Ken Birman

Graphical models and applications
- Artificial intelligence and machine learning are core technologies in many modern cloud settings
  - Support for social networking mechanisms
  - Creating product placement recommendations
  - Understanding the flow of “influence” within communities
- Graphical processing can also matter in systems
  - Understanding what to cache and what not to cache
  - Learning common patterns to optimize

CS5412 Spring 2014 (Cloud Computing: Birman)

What makes this hard?
- The prior generation of solutions was too general
  - Programming languages can do anything, but they aren’t at all specialized for graph-structured data
  - Database systems are awesome for tabular data but much less optimized for graphical data
- There is also an issue of scale
  - We’re good at what can be done on one computer
  - But a company like Facebook has billions of users, and its infrastructure runs on massive data centers

Today’s papers
- The TAO paper (I’ll start with this) gives a sense of the challenge Facebook confronts
  - Like an entire distributed operating system
  - But the whole role of the solution is to manage graphical data and support queries against it
  - Massive loads and surreal scale
- Things to notice?
  - How does the architecture of the solution reflect the special environment in which it runs?
  - How did they identify and optimize the critical paths?
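
To make "manage graphical data and support queries against it" and "what to cache" a little more concrete, here is a minimal single-process sketch of a read-through cache in front of a graph store. It is only loosely inspired by the kind of system the TAO paper describes and is not Facebook's API: the class names GraphStore and CachedGraph, the methods add_edge, query, and neighbors, and the invalidate-on-write policy are all assumptions made for illustration.

```python
from collections import defaultdict


class GraphStore:
    """Hypothetical backing store holding typed, timestamped directed edges."""

    def __init__(self):
        # (source, edge_type) -> list of (time, destination)
        self._edges = defaultdict(list)
        self.reads = 0  # counts how often the store itself is queried

    def add_edge(self, src, edge_type, dst, time):
        self._edges[(src, edge_type)].append((time, dst))

    def query(self, src, edge_type):
        """Return all edges of one type from src, newest first."""
        self.reads += 1
        return sorted(self._edges[(src, edge_type)], reverse=True)


class CachedGraph:
    """Read-through cache: reads hit the cache, misses fall back to the store."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def add_edge(self, src, edge_type, dst, time):
        self._store.add_edge(src, edge_type, dst, time)
        # Assumed policy: drop the cached list so the next read refetches it.
        self._cache.pop((src, edge_type), None)

    def neighbors(self, src, edge_type, limit=10):
        """Return up to `limit` destinations, newest edge first."""
        key = (src, edge_type)
        if key not in self._cache:
            self._cache[key] = self._store.query(src, edge_type)
        return [dst for _, dst in self._cache[key][:limit]]


store = GraphStore()
graph = CachedGraph(store)
graph.add_edge("alice", "friend", "bob", time=1)
graph.add_edge("alice", "friend", "carol", time=2)
first = graph.neighbors("alice", "friend")   # cache miss, reads the store
second = graph.neighbors("alice", "friend")  # cache hit, store untouched
```

The design point is that a read-heavy workload repeats the same few query shapes, so caching whole edge lists per (source, type) pair keeps most reads away from the backing store. A real deployment would also need eviction, sharding, and consistency across data centers, none of which this sketch attempts.
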

Dryad/LINQ
- Here we see two concepts combined
- At Microsoft, LINQ has become very popular
  - It embeds a kind of query processing into C# code
- Dryad takes this one step further
  - Given a LINQ expression, Dryad can run it on a distributed “computing engine” of their own design
  - The idea is to obtain massive parallelism

[Figure: Basic architecture of Dryad]
[Figure: Execution of a LINQ expression]
[Figures, two slides: A join, done in two ways]
[Figure: MapReduce in Dryad/LINQ]

Other major systems in this space
- Check out http://en.wikipedia.org/wiki/Graph_database
  - They list 50 or so graphical databases and processing systems
- Some popular ones in research settings are Pregel (from Google), GraphLab (CMU) and Vowpal Wabbit (“Fast Learning”) (Yahoo)

Takeaways
- Computer systems need to be responsive to
  - Styles of use (what our “customers” are doing)
  - Common patterns of load (optimize for this case)
- In today’s major cloud computing settings, graphical data and graphical learning solutions are becoming a highly dominant form of load and focus
- Computer systems need to evolve to track this need
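
As a rough illustration of the "MapReduce in Dryad/LINQ" slide above, MapReduce can be written as a short chain of query operators: flat-map each record, group the results by key, then reduce each group. The Python sketch below imitates that operator chain over pre-partitioned input. It runs sequentially in one process, whereas a Dryad-style engine would place each stage on many machines. The names map_reduce, word_count, mapper, key_of, and reducer are hypothetical and not taken from the Dryad/LINQ API.

```python
from collections import defaultdict
from itertools import chain


def map_reduce(partitions, mapper, key_of, reducer):
    """Flat-map, group-by-key, reduce over a list of input partitions."""
    # Stage 1: apply the mapper inside each partition independently.
    # These per-partition loops are the part an engine could run in parallel.
    mapped = [
        list(chain.from_iterable(mapper(record) for record in part))
        for part in partitions
    ]
    # Stage 2: "shuffle" by collecting items with the same key together.
    groups = defaultdict(list)
    for part in mapped:
        for item in part:
            groups[key_of(item)].append(item)
    # Stage 3: reduce each group to a single result.
    return {key: reducer(key, items) for key, items in groups.items()}


def word_count(partitions):
    """Count words across partitions of text lines."""
    return map_reduce(
        partitions,
        mapper=lambda line: line.lower().split(),
        key_of=lambda word: word,
        reducer=lambda word, items: len(items),
    )


counts = word_count([
    ["the graph is big", "the cache is hot"],  # partition 0
    ["the graph grows"],                       # partition 1
])
```

Writing the job as composable operators rather than hand-coded stages is what lets an engine decide how to partition the data and where to run each stage.
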