Progressive Visual Analytics! User-Driven Visual Exploration! of In-Progress Analytics Chad Stolper Georgia Tech Adam Perer IBM T.J. Watson Research Center David Gotz UNC Chapel Hill Visual Analytics Workflow 2 Visual Analytics Workflow 3 Visual Analytics Workflow 4 Visual Analytics Workflow 5 Visual Analytics Workflow 6 Visual Analytics Workflow 7 Visual Analytics Workflow 8 Visual Analytics Workflow 9 Visual Analytics Workflow ! d a B 10 Visual Analytics Workflow ! d a B 11 Visual Analytics Workflow ! d a B y Ver 12 1. Increasingly large quantities of data Increasingly complex analytic algorithms https://www.flickr.com/photos/veggiefrog/3435380297/ 13 1. Increasingly large quantities of data 2. Increasingly complex analytics https://www.flickr.com/photos/veggiefrog/3435380297/ 14 1. Increasingly large quantities of data 2. Increasingly complex analytics https://www.flickr.com/photos/veggiefrog/3435380297/ 15 Batch Visual Analytics Workflow 16 Batch Visual Analytics Workflow 17 Batch Visual Analytics Workflow 18 Batch Visual Analytics Workflow 19 Batch Visual Analytics Workflow 20 e v i s s e r og Batch Visual Analytics Workflow Pr 21 e v i s s e r og Batch Visual Analytics Workflow Pr 22 D. Fisher, I. Popov, S. Drucker, and m. c. schraefel, “Trust me, i’m partially right: incremental visualization lets analysts explore large datasets faster,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New York, NY, USA, 2012, pp. 1673–1682. e v i s s e r og Batch Visual Analytics Workflow Pr 23 e v i s s e r og Batch Visual Analytics Workflow Pr When is this strategy viable? How do we design systems to take advantage of it? 24 Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 25 Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 26 Progressive Analytics Semantically-Meaningful Partial Results 27 Progressive Analytics Semantically-Meaningful Partial Results (Thomas-- Results Feedback) 28 Progressive Analytics Semantically-Meaningful Partial Results 29 The partial results are of the same form as the final results Progressive Analytics Semantically-Meaningful Partial Results 30 http://en.wikipedia.org/wiki/K-means_clustering Progressive Analytics Semantically-Meaningful Partial Results 31 http://en.wikipedia.org/wiki/K-means_clustering Progressive Analytics Semantically-Meaningful Partial Results 32 http://en.wikipedia.org/wiki/K-means_clustering Progressive Analytics Semantically-Meaningful Partial Results 33 http://en.wikipedia.org/wiki/K-means_clustering Progressive Analytics Semantically-Meaningful Partial Results 34 K-Means Logistic Regression Progressive Analytics Semantically-Meaningful Partial Results 35 K-Means Logistic Regression Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 36 User-Driven Analytics Incorporate Analyst Knowledge 37 User-Driven Analytics Incorporate Analyst Knowledge 38 User-Driven Analytics Incorporate Analyst Knowledge 39 User-Driven Analytics Incorporate Analyst Knowledge (Thomas– Result Control) 40 Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 41 Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 42 Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 43 Progressive Visualization Visualize the Partial Results 44 Progressive Visualization Visualize the Partial Results (Thomas– Results Feedback) 45 Progressive Visualization Analytic Visualization 46 Analyst Progressive Visualization Up-to-Date Information Analytic Visualization 47 Analyst Progressive Visualization Up-to-Date Information Analytic Visualization 48 Analyst Progressive Visualization Up-to-Date Information Analytic Visualization http://info.jmu.edu/dux/files/2013/01/chaos.jpg 49 Analyst Progressive Visualization Up-to-Date Information Analytic Visualization http://info.jmu.edu/dux/files/2013/01/chaos.jpg 50 Analyst http://www.historyforkids.org/learn/medieval/history/byzantine/justinian.jpg Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 51 Progressive Visualization Up-to-Date Information Analytic Visualization 52 Analyst Interactive Visualization Up-to-Date Information Analytic Visualization Analyst’s Domain Knowledge 53 Analyst Interactive Visualization Up-to-Date Information Analytic Visualization Analyst Analyst’s Domain Knowledge 54 (Thomas– Execution Control) Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 55 So what might such a system look like? 56 Electronic Medical Records 57 Electronic Medical Records Procedures Lab Tests Diagnoses Medications 58 Electronic Medical Records HF BB HF D Frequent Patterns Event Series Correlate with Outcomes 59 Electronic Medical Records HF D HF D HF BB HF D COPD HF HF BB D BB D Frequent Patterns Event Series Correlate with Outcomes 60 Electronic Medical Records HF D HF D HF BB HF D COPD HF HF BB D BB D Frequent Patterns Event Series Correlate with Outcomes 61 Electronic Medical Records HF D HF D HF BB HF D COPD HF HF BB D BB D Frequent Patterns Event Series Correlate with Outcomes 62 Electronic Medical Records HF D HF D HF BB HF D COPD HF HF BB D BB D Frequent Patterns Event Series Correlate with Outcomes 63 Electronic Medical Records What Works? (What Doesn’t Work?) HF D HF D HF BB HF D COPD HF HF BB D BB D Frequent Patterns Event Series Correlate with Outcomes 64 Electronic Medical Records What Works? (What Doesn’t Work?) HF D HF D HF BB HF D COPD HF HF BB D BB D Frequent Patterns Event Series Correlate with Outcomes 65 Sequential Pattern Mining Algorithm 66 J.Ayres, J.Flannick, J.Gehrke, and T.Yiu. Sequential Pattern mining using a bitmap representation. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’02, pages 429–435, New York, NY, USA, 2002. ACM. Sequential Pattern Mining Algorithm (Thomas– Dependent Subdivision) 67 J.Ayres, J.Flannick, J.Gehrke, and T.Yiu. Sequential Pattern mining using a bitmap representation. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’02, pages 429–435, New York, NY, USA, 2002. ACM. Sequential Pattern Mining Algorithm § Build a tree of every possible pattern • (up to a maximum length) § Prune the tree • 68 (using a support threshold) Sequential Pattern Mining Algorithm 50% Support (2 Patients) COPD HF § Build a tree of every possible pattern HF BB D BB D • HF HF D HF § Prune the tree BB HF BB HF BB BB D 69 (up to a maximum length) • (using a support threshold) Sequential Pattern Mining Algorithm 50% Support (2 Patients) COPD HF § Build a tree of every possible pattern HF BB D BB D • HF HF D HF § Prune the tree BB HF BB HF BB BB D 70 (up to a maximum length) • (using a support threshold) Sequential Pattern Mining Algorithm 50% Support (2 Patients) COPD HF § Build a tree of every possible pattern HF BB D BB D • HF HF D HF § Prune the tree BB HF BB HF BB BB D 71 (up to a maximum length) • (using a support threshold) Sequential Pattern Mining Algorithm Frequent Patterns § Depth-First Search Event Series § Outputs frequent patterns as each is found Correlate with Outcomes 72 Sequential Pattern Mining Algorithm Frequent Patterns § Depth-First Search Event Series § Outputs frequent patterns as each is found Correlate with Outcomes 73 Sequential Pattern Mining Algorithm Frequent Patterns § Depth-First Search Event Series § Outputs frequent patterns as each is found Correlate with Outcomes 74 Interactive Sequential Pattern Mining § Breadth-First Search Frequent Patterns Event Series Correlate with Outcomes Shorter patterns first Analyst prioritization § Outputs frequent patterns as each is found Manual Pruning 75 Interactive Sequential Pattern Mining § Breadth-First Search Frequent Patterns Event Series Correlate with Outcomes Shorter patterns first Analyst prioritization § Outputs frequent patterns as each is found Manual HF BB Pruning D COPD 76 BB D Interactive Sequential Pattern Mining § Breadth-First Search Frequent Patterns Event Series Correlate with Outcomes • Shorter patterns first Analyst prioritization § Outputs frequent patterns as each is found HF Manual Pruning 77 BB D COPD Interactive Sequential Pattern Mining § Breadth-First Search • Shorter patterns first Analyst prioritization http://thenextweb.com/wp-content/blogs.dir/1/files/2013/02/queue.jpg § Outputs frequent patterns as each is found Manual Pruning 78 Interactive Sequential Pattern Mining § Breadth-First Search • • http://thenextweb.com/wp-content/blogs.dir/1/files/2013/02/queue.jpg Shorter patterns first Analyst prioritization § Outputs frequent patterns as each is found Manual Pruning 79 Interactive Sequential Pattern Mining § Breadth-First Search • • HF BB D COPD Shorter patterns first Analyst prioritization § Outputs frequent patterns as each is found Manual Pruning 80 Interactive Sequential Pattern Mining § Breadth-First Search • • Shorter patterns first Analyst prioritization § Outputs frequent patterns as each is found § Manual Pruning 81 http://www.bluestonegarden.com/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/WOLF-Garten-Telescoping-Hedge-Shears-1.png Interactive Sequential Pattern Mining § Breadth-First Search • • COPD D Shorter patterns first Analyst prioritization § Outputs frequent patterns as each is found § Manual Pruning 82 Interactive Sequential Pattern Mining § Breadth-First Search • • COPD D Shorter patterns first Analyst prioritization § Outputs frequent patterns as each is found § Manual Pruning 83 Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 84 Progressive Visual Analytics Progressive, User-Driven Analytics Progressive, Interactive Visualization 85 86 HF 87 COPD D HF 88 COPD HF 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 ? ? 120 ? ? 121 ? 122 ? 123 ? 124 ? 125 126 127 So Does it Work? 128 So Does it Work? Medical Researchers, University Hospital of North Norway 129 So Does it Work? AMIA 2014 Skrovseth, Perer, Delaney, Revaug, Lindsetmo, Augestad "Detecting Novel Associations for Surgical Hospital Readmissions in Large Datasets by Interactive Visual Analytics” 130 J Contributions § A precise definition and design guidelines for Progressive Visual Analytics § An example Progressive Visual Analytics system, Progressive Insights § (In the paper) A case study of using Progressive Insights for exploring frequent patterns in EHR 131 Contributions § A precise definition and design guidelines for Progressive Visual Analytics § An example Progressive Visual Analytics system, Progressive Insights § (In the paper) A case study of using Progressive Insights for exploring frequent patterns in EHR 132 Contributions § A precise definition and design guidelines for Progressive Visual Analytics § An example Progressive Visual Analytics system, Progressive Insights § (In the paper) A case study of using Progressive Insights for exploring frequent patterns in EHR 133 Contributions § A precise definition and design guidelines for Progressive Visual Analytics § An example Progressive Visual Analytics system, Progressive Insights § (In the paper) A case study of using Progressive Insights for exploring frequent patterns in EHR 134 Adam Perer Chad Stolper IBM T.J. Watson Research Center Georgia Tech chadstolper@gatech.edu David Gotz UNC Chapel Hill 135 Thank You! Adam Perer Chad Stolper IBM T.J. Watson Research Center Georgia Tech chadstolper@gatech.edu David Gotz UNC Chapel Hill 136 Questions? Adam Perer Chad Stolper IBM T.J. Watson Research Center Georgia Tech chadstolper@gatech.edu David Gotz UNC Chapel Hill 137 Questions? Adam Perer Chad Stolper IBM T.J. Watson Research Center Georgia Tech chadstolper@gatech.edu David Gotz UNC Chapel Hill 138 Progressive Visual Analytics Seven Design Guidelines 139 Progressive Visual Analytics Systems Analytics should… 1. Provide increasingly meaningful partial results as the algorithm executes 2. Allow users to focus the algorithm to subspaces of interest 3. Allow users to ignore irrelevant subspaces 140 Progressive Visual Analytics Systems Visualizations should… 4. Minimize distractions by not changing views excessively 5. Provide cues to indicate where new results have been found 6. Support on-demand refresh 7. Provide an interface to specify where the analytics should focus and ignore 141