Remco Chang – KuperbergLab 2016

Visual Analytics and Human Subject Studies in Visualization Research
Remco Chang
Assistant Professor, Computer Science, Tufts University

Visual Analytics Lab at Tufts
• VIS+Database (MIT)
  – Big data systems
• Machine Learning (MIT Lincoln Lab)
  – User-in-the-loop visual analytics systems
• Modeling (Wisconsin)
  – Comprehensible modeling
• Perception (Northwestern, U British Columbia)
  – Perceptual modeling
• Psychology (Tufts Psych Dept)
  – Individual difference
• “Storytelling” (Maine Medical Center)
  – Medical risk communication

Financial Fraud – A Case for Visual Analytics
• Financial institutions like Bank of America have legal responsibilities to report all suspicious wire transaction activities
  • Money laundering, supporting terrorist activities, etc.
• Data size: approximately 200,000 transactions per day (73 million transactions per year)

Financial Fraud – A Case Study for Visual Analytics
• Problems:
  • Automated approaches can only detect known patterns
  • Bad guys are smart: patterns are constantly changing
• Previous methods:
  • 10 analysts monitoring and analyzing all transactions
  • Using SQL queries and spreadsheet-like interfaces
  • Limited time scale (2 weeks)

WireVis: Financial Fraud Analysis
• In collaboration with Bank of America
• Visualizes 7 million transactions over 1 year
• A great problem for visual analytics:
  • Ill-defined problem (how does one define fraud?)
  • Limited or no training data (patterns keep changing)
  • Requires human judgment in the end (involves law enforcement agencies)
R. Chang et al., Scalable and interactive visual analysis of financial wire transactions for fraud detection. Information Visualization, 2008.
R. Chang et al., WireVis: Visualization of categorical, time-varying data from financial transactions. IEEE VAST, 2007.
WireVis: A Visual Analytics Approach
• Heatmap View (Accounts to Keywords Relationship)
• Search by Example (Find Similar Accounts)
• Keyword Network (Keyword Relationships)
• Multiple Temporal View (Relationships over Time)

Evaluation
• Challenging – lack of ground truth
• Two types of evaluations:
  – Grounded Evaluation: real analysts, real data
    • Find transactions that existing techniques can find
    • Find new transactions that appear suspicious
  – Controlled Evaluation: real analysts, synthetic data
    • Find all injected threat scenarios
• Adoption and Deployment

Good Lessons Learned
• Analyst behavior
  • 90% of time on Exploratory Data Analysis (EDA)
  • 10% on confirmation (CDA)
• Big data analysis == fast hypothesis testing
  • High interactivity is key
  • Users can wait to find the exact answer

Interactive Visualization Systems
• Political Simulation – Agent-based analysis
• Bridge Maintenance – Exploring inspection reports
• Biomechanical Motion – Interactive motion comparison
• Interactive Metric Learning – DisFunction: learn a model from projection
• High-D Data Exploration – iPCA: Interactive PCA
Political Simulation (Jordan Crouser): R. Chang et al., Two Visualization Tools for Analysis of Agent-Based Simulations in Political Science. IEEE CG&A, 2012.
Bridge Maintenance: R. Chang et al., An Interactive Visual Analytics System for Bridge Management, Journal of Computer Graphics Forum, 2010.
Interactive Visualization Systems
• Biomechanical Motion – Interactive motion comparison
  R. Chang et al., Interactive Coordinated Multiple-View Visualization of Biomechanical Motion Data, IEEE Vis (TVCG) 2009.
• Interactive Metric Learning – DisFunction: learn a model from projection (Eli Brown)
  R. Chang et al., Dis-function: Learning Distance Functions Interactively, IEEE VAST 2011.
• High-D Data Exploration – iPCA: Interactive PCA
  R. Chang et al., iPCA: An Interactive System for PCA-based Visual Analytics, EuroVis 2009.

Individual Differences and Interaction Pattern
• Existing research shows that all the following factors affect how someone uses a visualization:
  – Spatial Ability
  – Experience (novice vs. expert)
  – Emotional State
  – Personality
  – Cognitive Workload/Mental Demand
  – Perception
  – … and more
Peck et al., ICD3: Towards a 3-Dimensional Model of Individual Cognitive Differences. BELIV 2012
Peck et al., Using fNIRS Brain Sensing To Evaluate Information Visualization Interfaces.
CHI 2013

Cognitive Load
Functional Near-Infrared Spectroscopy
• fNIRS
  • a lightweight brain sensing technique
  • measures mental demand (working memory)
Evan Peck et al., Using fNIRS Brain Sensing to Evaluate Information Visualization Interfaces. CHI 2013.

Crowdsourcing Experiments in Visualization Research: Data, Perception, and Cognition

1: Collect User Generated Data
Eli Brown, Alvitta Ottley
• Research Question: Can users’ interactions predict:
  • User’s performance in a task
  • User’s individual differences
• Need: large number of participants (>100)

Experiment: Finding Waldo
• Google-Maps style interface
  • Left, Right, Up, Down, Zoom In, Zoom Out, Found
Brown et al., Finding Waldo: Learning about Users from their Interactions. IEEE VAST 2014

Pilot Visualization – Completion Time
[Figure: fast completion time vs. slow completion time]

Post-hoc Analysis Results

Mean Split (50% Fast, 50% Slow)
Data Representation    Classification Accuracy    Method
State Space            72%                        SVM
Edge Space             63%                        SVM
Sequence (n-gram)      77%                        Decision Tree
Mouse Event            62%                        SVM

Fast vs. Slow Split (Mean+0.5σ=Fast, Mean-0.5σ=Slow)
Data Representation    Classification Accuracy    Method
State Space            96%                        SVM
Edge Space             83%                        SVM
Sequence (n-gram)      79%                        Decision Tree
Mouse Event            79%                        SVM

Predicting a User’s Personality
[Figure: External Locus of Control vs. Internal Locus of Control]
Ziemkiewicz et al., How locus of control influences compatibility with visualization style. IEEE VAST, 2011.
Ottley et al., Understanding visualization by understanding individual users. IEEE CG&A, 2012.

Predicting Users’ Personality Traits
Predicting user’s “Extraversion”
Linear SVM Accuracy: ~60%
• Noisy results:
  • “Extraversion”, “Neuroticism”, and “Locus of Control” at ~60% accuracy.
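As a rough illustration of the “Sequence (n-gram)” representation in the Post-hoc Analysis Results, the sketch below counts n-grams over one user's sequence of interface actions and normalizes the counts into a feature vector that a classifier such as a decision tree could consume. The slides do not give the n-gram length, the encoding, or any code, so the function names `countNgrams` and `toFeatureVector`, the choice of n = 2, the `>` separator, and the sample session are all assumptions made for illustration.

```javascript
// Hypothetical helper: count n-grams over a user's sequence of interface actions.
// Action names follow the Finding Waldo interface; everything else is assumed.
function countNgrams(actions, n) {
  const counts = new Map();
  for (let i = 0; i + n <= actions.length; i++) {
    const gram = actions.slice(i, i + n).join(">");
    counts.set(gram, (counts.get(gram) || 0) + 1);
  }
  return counts;
}

// Map the counts onto a shared vocabulary and divide by the total number of
// n-grams, so that long and short sessions produce comparable vectors.
function toFeatureVector(counts, vocabulary) {
  let total = 0;
  for (const c of counts.values()) total += c;
  return vocabulary.map((gram) =>
    total === 0 ? 0 : (counts.get(gram) || 0) / total
  );
}

// Made-up session: pan right twice, zoom in, pan right twice, then click Found.
const session = ["Right", "Right", "Zoom In", "Right", "Right", "Found"];
const bigrams = countNgrams(session, 2);

// In practice the vocabulary would be collected across all participants.
const vocabulary = ["Right>Right", "Right>Zoom In", "Zoom In>Right", "Right>Found"];
const features = toFeatureVector(bigrams, vocabulary);
console.log(features);
```

Normalizing by the total count is one design choice among several; raw counts or binary presence flags would be equally plausible inputs to a decision tree.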
Lessons Learned
• Log everything!
  • Mouse movement, click, time stamp, etc.
  • Useful for manual removal of “bad” data
  • E.g. subject leaves for 5 minutes
• Mechanism:
  • Store in JavaScript
  • Send in batch via PHP on “next page”
  • User doesn’t mind waiting
  • PHP writes to server in plain text
  • CSV or JSON
[Figure labels: External Locus of Control, Internal Locus of Control]

2: Perceptual Studies
Lane Harrison, Fumeng Yang
• Research Question: Can we model perception using MTurk?
  • Replicate laboratory study
  • Re: Darren’s talk on Tuesday
  • Extend to other conditions
• Need:
  • Large number of judgments (200,000)
  • Large number of Turkers (> 1,500)
Harrison et al., Ranking Visualization Effectiveness Using Weber's Law. IEEE InfoVis 2014

Another Experiment
Imagine yourself in a dark room….

Perceptual Modeling
• Weber’s Law (mid 1800s)
• Low-level perceptual discrimination (sound, touch, taste, brightness, etc.)
$dp = k \frac{dS}{S}$
where $dp$ is the perceived difference, $dS$ is the change in intensity, $S$ is the intensity of the stimulus, and $k$ is Weber’s fraction (found via experiments).
Given a fixed stimulus $S$, the smallest $dS$ that can be perceived by humans is known as the “Just Noticeable Difference”, or JND.

Replication Study
• Replicated, using MTurk, Ron Rensink’s 2010 experiment, which shows that the relationship between JND and correlation (r) is linear and follows Weber’s Law

Our Question...
If the perception of correlation in scatterplots follows Weber’s law...
[Figure axis: worse to better]

What does the perception of correlation in other charts look like?
[Figure axis: worse to better]

[Figure axis: more precise to less precise]

The perception of correlation in every tested chart can be modeled using Weber’s law.

Ranking Visualizations of Correlation

Lessons Learned
• MTurk is good for perceptual studies
  • Slightly higher variance
  • Overall trend and values hold
• Useful for between-subject studies
• Check device/browser type (disallow handheld devices)
• Check screen resolution
  • window.screen.width
  • window.screen.availWidth

3: Cognitive Studies
Alvitta Ottley, Lane Harrison
• Research Question: Do individual differences affect people’s ability to use visualizations?
  • Priming emotion
  • Priming locus of control
• Need:
  • Large number of participants (~1,000)
  • Long complicated study design (~30 minutes)
  • Speed (and accuracy) matter

What is Priming?

Verbal Priming
• For example, to make someone have a more external LOC:
“We know that one of the things that influence how well you can do everyday tasks is the number of obstacles you face on a daily basis. If you are having a particularly bad day today, you may not do as well as you might on a day when everything goes as planned. Variability is a normal part of life and you might think you can’t do much about that aspect. In the space provided below, give 3 examples of times when you have felt out of control and unable to achieve something you set out to do.
Each example must be at least 100 words long.”

Priming Emotion on Visual Judgment
Harrison et al., Influencing Visual Judgment Through Affective Priming, CHI 2013

Background: Locus of Control and VIS
• Task (Inferential): “There’s something interesting about folder XXX. Find another folder that shares a similar pattern”
[Figure: views V1, V2, V3, V4]
Ziemkiewicz et al., How Locus of Control Influences Compatibility with Visualization Style, IEEE VAST 2011.

Locus of Control and VIS
• With the list view compared to the containment view, internal LOC users are:
  • Faster (by 70%)
  • More accurate (by 34%)
• The speed improvement is about 2 minutes (116 seconds)

Results: Average Primed to be Internal
Results: Internal Primed to be External
Results: External Primed to be Internal
[Figure, one per result: performance (poor to good) against visual form (list view vs. containment) for external, average, and internal LOC]
Ottley et al., Manipulating and Controlling for Personality Effects on Visualization Tasks, Information Visualization, 2013

Results

Lessons Learned
• Pay minimum wage
  • 30 minutes ~= $3
  • IRB requires that we pay everyone
• Bonus on time and accuracy
• Disable the “Back” button
• Estimate about 20%** of all resulting data will be “bad” in some way, e.g.
  • Failing pre-task or consistency check
  • Completed way too fast
• Recruit from North America only
** The percentage has decreased in the past few years. It used to be around 40%.

Conclusion
• Crowdsourcing works for VIS experiments
  • Collect User Generated Data
  • Conduct Perceptual Experiments
  • Run Cognitive Studies
• Many “gotchas” but most of them are avoidable

Questions?
remco@cs.tufts.edu
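
The “bad” data checks named in the Lessons Learned slides (failed consistency check, completing way too fast, a subject leaving for 5 minutes, an unsuitable screen) could be applied to logged submissions roughly as sketched below. This is only an illustration under stated assumptions: the record fields (`passedConsistencyCheck`, `screenWidth`, `eventTimestamps`), the thresholds, and the function names are hypothetical and do not come from the talk. `screenWidth` is assumed to have been captured in the browser from `window.screen.width`. The sketch flags submissions for manual review rather than deleting them.

```javascript
// All thresholds are assumed values for illustration, not numbers from the talk.
const MINUTE_MS = 60 * 1000;
const MIN_DURATION_MS = 5 * MINUTE_MS; // assumed floor for a ~30 minute study
const MAX_IDLE_GAP_MS = 5 * MINUTE_MS; // "subject leaves for 5 minutes"
const MIN_SCREEN_WIDTH = 1024;         // assumed minimum, in CSS pixels

// Longest pause between two consecutive logged events.
function longestGap(timestamps) {
  let gap = 0;
  for (let i = 1; i < timestamps.length; i++) {
    gap = Math.max(gap, timestamps[i] - timestamps[i - 1]);
  }
  return gap;
}

// Return the list of reasons a submission looks "bad" (empty if none).
function flagSubmission(sub) {
  const reasons = [];
  if (!sub.passedConsistencyCheck) reasons.push("failed consistency check");
  const ts = sub.eventTimestamps;
  const duration = ts.length > 0 ? ts[ts.length - 1] - ts[0] : 0;
  if (duration < MIN_DURATION_MS) reasons.push("completed too fast");
  if (longestGap(ts) >= MAX_IDLE_GAP_MS) reasons.push("long idle gap");
  if (sub.screenWidth < MIN_SCREEN_WIDTH) reasons.push("screen too small");
  return reasons;
}

// Made-up event logs: one event per minute between `from` and `to` (inclusive).
const everyMinute = (from, to) =>
  Array.from({ length: to - from + 1 }, (_, i) => (from + i) * MINUTE_MS);

const submissions = [
  { id: "A", passedConsistencyCheck: true, screenWidth: 1440,
    eventTimestamps: everyMinute(0, 28) },
  { id: "B", passedConsistencyCheck: true, screenWidth: 1440,
    eventTimestamps: everyMinute(0, 2) },
  { id: "C", passedConsistencyCheck: false, screenWidth: 800,
    eventTimestamps: [...everyMinute(0, 3), ...everyMinute(10, 12)] },
];

const flagged = submissions.map((s) => ({ id: s.id, reasons: flagSubmission(s) }));
console.log(flagged);
```

Keeping the reasons as a list, instead of a single pass/fail flag, makes it easier to review borderline cases by hand and to report how much data each check removed.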