DEFOG
A System for Data-Backed Visual Composition
Lauro Lins, David Koop, Juliana Freire, Claudio Silva
SCI Institute, University of Utah
August 2010

Motivation
[Figure labels: SSDBM 2008; Vistrails Screenshots + Adobe Illustrator; Python + R; Python + R + GraphViz + Inkscape]

Motivation
[Same figure, with added annotations: Hypothesis; Visualization Ideas; Visualization; "too much time here!"]

The DEFOG System
• Streamlines the process of creating visualizations
• Supports the composition of data-driven visualizations
• Allows users to explore creative data visualizations that are either hard or impossible to express with other tools
• Users can compose novel representations by combining existing visualization techniques
• Here is an example…

Analyzing WWW2010 Submissions
Input = XML file with WWW 2010 submissions

Analyzing WWW2010 Submissions
[Chart of submission areas: Web Services and Service-Oriented Computing; Bridging Structured and Unstructured Data; User Interfaces and Rich Interaction; Software Architecture and Infrastructure; Social Networks and Communities; Data Mining and Machine Learning; Semantic Web; Internet Monetization; Security and Privacy; Networking and Mobility; Performance, Scalability and Availability; Rich Media; Search]
[Diagram labels: Excel, DEFOG]

Analyzing WWW2010 Submissions
[Same chart of submission areas]
DEFOG used to create visualizations and compose them for presentation

WWW 2010: Area Overlap

DEFOG Features
• Extensible set of data visualization techniques
– Statistical plots (scatter plots, histograms, etc.)
– Graph drawing
– Add your own!
• Flexible combination of visualizations and techniques:
– Side by side, nested, linked
– E.g., graph inside a plot, graph of plots, plots with elements connected to elements in other plots
• Integration of visual manipulation and programming
• Work in progress:
– Provenance capture
– Scalability: support for millions of objects

DEFOG combines…
Python
• Language to represent data and computations
• Used to configure the “face programs” of the scene elements
• Python console to inspect anything at any time
Tableau
• DEFOG adopts the drag-and-drop approach of Tableau to express data manipulation and visualization
• Tableau can be seen as a “face program” of DEFOG
Drawing tools
• DEFOG combines free drawing with data analysis and visualization
• Operations supported by drawing tools are available in DEFOG (e.g., copy-and-paste, transform, free drawing, fine tuning)

DEFOG has been used to help scientists make sense of their data

Familial Association Between Cancer Sites
• Collaborators: Lisa Canon-Albright and Craig Teerlink
Genetic Epidemiology, University of Utah
[Diagram labels: Excel Spreadsheet; DEFOG; Cancer Sites Network; JNCI Submission]

Biochemical Pathways
• Collaborators: Elizabeth Skovran and Mary Lidstrom
Lidstrom Laboratory, University of Washington
[Diagram labels: Excel Spreadsheet; DEFOG; Data Analysis]

Type 2 Diabetes Research
• Collaborators: Robert Cooksey and Donald McClain
Division of Endocrinology and Metabolism, University of Utah
[Diagram labels: Custom Text File; DEFOG]
“Our laboratory studies the molecular basis for type 2 diabetes as well as the role iron plays in development of the diabetic phenotype… We have run ~136 mice producing over 2.8 x 10^7 data points. We have started using DEFOG to visualize these data. This has led to one novel observation. That mice shift from fatty acid oxidation during the day (inactive period) to glucose oxidation at night (active period) as measured by the RER fluctuation. Highlighting the circadian shift in nutrient utilization during a 24 hour cycle.”
Robert Cooksey

DEFOG vs. Other Visualization Tools

Visualization Programming Languages
[Examples shown: Protovis; Processing; Microsoft Research Vedea]
• These languages provide constructs that make it easier to generate visualizations than general-purpose programming languages do
• But visualizations are still created programmatically
• They can be integrated with DEFOG

Tableau
• Database visualization system
• Drag-and-drop interface to explore data visualizations
– Hides the underlying query language from the users
• Supports efficient data exploration through “basic” visualizations
• Doesn’t support the rich visualizations needed to understand more complicated datasets (the norm in science)
• The Tableau way of expressing “basic” visualizations can be integrated into DEFOG as a “face program”

MSR Live Labs Pivot
• Visual interaction with a collection of records (all of the same type?)
• Potential to be the core of an interface to explore Web search results (not there yet)
• Not suitable for exploratory tasks that involve heterogeneous data (not in tabular format)

MSR NodeXL
• Adds network drawing and algorithms to Excel spreadsheets
• Excel is widely used, and network datasets (many already in Excel) can benefit from NodeXL

Potential Synergy
[Diagram labels: SQL Server; Protovis; NodeXL; Excel; DEFOG; PowerPoint; Word]

Thank you
DEFOG and Provenance Analytics
DEMO