Visual Analytics and Human Subject Studies in Visualization Research Remco Chang Assistant Professor

Remco Chang – KuperbergLab 2016
Visual Analytics and Human Subject
Studies in Visualization Research
Remco Chang
Assistant Professor
Computer Science, Tufts University
Visual Analytics Lab at Tufts
• VIS+Database (MIT)
– Big data systems
• Machine Learning (MIT Lincoln Lab)
– User-in-the-loop visual analytics systems
• Modeling (Wisconsin)
– Comprehensible modeling
• Perception (Northwestern, U British Columbia)
– Perceptual modeling
• Psychology (Tufts Psych Dept)
– Individual difference
• “Storytelling” (Maine Medical Center)
– Medical risk communication
Financial Fraud – A Case for Visual Analytics
• Financial institutions like Bank of America have legal responsibilities to report all suspicious wire transaction activities
– Money laundering, supporting terrorist activities, etc.
• Data size: approximately 200,000 transactions per day (73 million transactions per year)
Financial Fraud – A Case Study for Visual Analytics
• Problems:
– Automated approach can only detect known patterns
– Bad guys are smart: patterns are constantly changing
• Previous methods:
– 10 analysts monitoring and analyzing all transactions
– Using SQL queries and spreadsheet-like interfaces
– Limited time scale (2 weeks)
WireVis: Financial Fraud Analysis
• In collaboration with Bank of America
• Visualizes 7 million transactions over 1 year
• A great problem for visual analytics:
– Ill-defined problem (how does one define fraud?)
– Limited or no training data (patterns keep changing)
– Requires human judgment in the end (involves law enforcement agencies)
R. Chang et al., Scalable and interactive visual analysis of financial wire transactions for fraud detection. Information Visualization, 2008.
R. Chang et al., WireVis: Visualization of categorical, time-varying data from financial transactions. IEEE VAST, 2007.
WireVis: A Visual Analytics Approach
• Heatmap View (Accounts to Keywords Relationship)
• Search by Example (Find Similar Accounts)
• Keyword Network (Keyword Relationships)
• Multiple Temporal View (Relationships over Time)
Evaluation
• Challenging – lack of ground truth
• Two types of evaluations:
– Grounded Evaluation: real analysts, real data
• Find transactions that existing techniques can find
• Find new transactions that appear suspicious
– Controlled Evaluation: real analysts, synthetic data
• Find all injected threat scenarios
• Adoption and Deployment
Good Lessons Learned
• Analyst behavior
– 90% of time on Exploratory Data Analysis (EDA)
– 10% on confirmation (CDA)
• Big data analysis == fast hypothesis testing
• High interactivity is key
• Users can wait to find the exact answer
Jordan Crouser
Interactive Visualization Systems
• Political Simulation
– Agent-based analysis
• Bridge Maintenance
– Exploring inspection reports
• Biomechanical Motion
– Interactive motion comparison
• Interactive Metric Learning
– DisFunction: learn a model from projection
• High-D Data Exploration
– iPCA: Interactive PCA
R. Chang et al., Two Visualization Tools for Analysis of Agent-Based Simulations in Political Science. IEEE CG&A, 2012
R. Chang et al., An Interactive Visual Analytics System for Bridge Management, Journal of Computer Graphics Forum, 2010.
R. Chang et al., Interactive Coordinated Multiple-View Visualization of Biomechanical Motion Data, IEEE Vis (TVCG) 2009.
Eli Brown
R. Chang et al., Dis-function: Learning Distance Functions Interactively, IEEE VAST 2011.
R. Chang et al., iPCA: An Interactive System for PCA-based Visual Analytics, EuroVis 2009.
Individual Differences and Interaction Pattern
• Existing research shows that all the following factors affect how someone uses a visualization:
– Spatial Ability
– Experience (novice vs. expert)
– Emotional State
– Personality
– Cognitive Workload/Mental Demand
– Perception
– … and more
Peck et al., ICD3: Towards a 3-Dimensional Model of Individual Cognitive Differences. BELIV 2012
Peck et al., Using fNIRS Brain Sensing To Evaluate Information Visualization Interfaces. CHI 2013
Cognitive Load
Functional Near-Infrared Spectroscopy
• fNIRS
– A lightweight brain sensing technique
– Measures mental demand (working memory)
Evan Peck et al., Using fNIRS Brain Sensing to Evaluate Information Visualization Interfaces. CHI 2013.
Crowdsourcing Experiments in Visualization Research: Data, Perception, and Cognition
1: Collect User Generated Data
• Research Question: Can users’ interactions predict:
– User’s performance in a task
– User’s individual differences
• Need: large number of participants (>100)
Eli Brown, Alvitta Ottley
Experiment: Finding Waldo
• Google-Maps style interface
• Left, Right, Up, Down, Zoom In, Zoom Out, Found
Brown et al., Finding Waldo: Learning about Users from their Interactions. IEEE VAST 2014
Pilot Visualization – Completion Time
[Figure panels: fast completion time, slow completion time]
Post-hoc Analysis Results
Mean Split (50% Fast, 50% Slow)
Data Representation    Classification Accuracy    Method
State Space            72%                        SVM
Edge Space             63%                        SVM
Sequence (n-gram)      77%                        Decision Tree
Mouse Event            62%                        SVM

Fast vs. Slow Split (Mean+0.5σ=Fast, Mean-0.5σ=Slow)
Data Representation    Classification Accuracy    Method
State Space            96%                        SVM
Edge Space             83%                        SVM
Sequence (n-gram)      79%                        Decision Tree
Mouse Event            79%                        SVM
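To make the "Sequence (n-gram)" representation concrete, here is a minimal sketch of how a log of interface actions could be turned into n-gram count features. It is an illustration, not the pipeline from the Finding Waldo paper. The function names, the ">" separator, and the sample session are assumptions. The action names are the ones listed for the Finding Waldo interface.

```javascript
// Count every run of n consecutive actions in one user's interaction log.
// The ">" separator used to build keys is an arbitrary choice for this sketch.
function ngramCounts(actions, n) {
  const counts = new Map();
  for (let i = 0; i + n <= actions.length; i++) {
    const key = actions.slice(i, i + n).join(">");
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Project the counts onto a shared vocabulary so that every user gets a
// feature vector of the same length, ready for a classifier.
function toFeatureVector(counts, vocabulary) {
  return vocabulary.map((key) => counts.get(key) || 0);
}

// Hypothetical session, for illustration only (not study data).
const session = ["Right", "Right", "Zoom In", "Right", "Right", "Found"];
const bigrams = ngramCounts(session, 2);
const vocabulary = [
  "Right>Right",
  "Right>Zoom In",
  "Zoom In>Right",
  "Right>Found",
  "Left>Left",
];
console.log(toFeatureVector(bigrams, vocabulary)); // counts: 2, 1, 1, 1, 0
```

The vectors produced this way could then be fed to a classifier such as the decision tree named in the table. That step is left out here because it would depend on a specific library.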
Predicting a User’s Personality
External Locus of Control
Internal Locus of Control
Ziemkiewicz et al., How locus of control influences compatibility with visualization style. IEEE VAST, 2011.
Ottley et al., Understanding visualization by understanding individual users. IEEE CG&A, 2012.
Predicting Users’ Personality Traits
• Predicting user’s “Extraversion”: Linear SVM, accuracy ~60%
• Noisy results:
– “Extraversion”, “Neuroticism”, and “Locus of Control” at ~60% accuracy.
Lessons Learned
• Log everything!
– Mouse movement, click, time stamp, etc.
– Useful for manual removal of “bad” data
– E.g. subject leaves for 5 minutes
• Mechanism:
– Store in JavaScript
– Send in batch via PHP on “next page”
– User doesn’t mind waiting
– PHP writes to server in plain text
– CSV or JSON
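The mechanism above (store events in JavaScript, send them in one batch on “next page”) might look roughly like the sketch below. This is an assumed design, not the lab's actual logging code. The class name `InteractionLogger`, the endpoint `log.php`, and the `#next` button selector are hypothetical.

```javascript
// Buffers interaction events in memory and hands them to a sender in one batch.
class InteractionLogger {
  constructor(now = () => Date.now()) {
    this.now = now; // injectable clock, which also makes the logger easy to test
    this.events = [];
  }

  // Record one event, e.g. log("click", { x: 10, y: 20 }).
  // The timestamp and type are written last so `detail` cannot overwrite them.
  log(type, detail = {}) {
    this.events.push({ ...detail, t: this.now(), type });
  }

  // Serialize the buffer as JSON, clear it, and pass the payload to `send`.
  // Note: the buffer is cleared even if `send` later fails; a production
  // version would want to retry or keep a copy.
  flush(send) {
    const payload = JSON.stringify(this.events);
    this.events = [];
    return send(payload);
  }
}

// Browser wiring (sketch only; "log.php" and "#next" are hypothetical names):
//   const logger = new InteractionLogger();
//   document.addEventListener("mousemove", (e) =>
//     logger.log("mousemove", { x: e.clientX, y: e.clientY }));
//   document.addEventListener("click", (e) =>
//     logger.log("click", { x: e.clientX, y: e.clientY }));
//   document.querySelector("#next").addEventListener("click", () =>
//     logger.flush((body) => fetch("log.php", { method: "POST", body })));
```

Sending only on page transitions keeps network traffic out of the timed task. Participants already expect a short wait at that point.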
2: Perceptual Studies
• Research Question: Can we model perception using MTurk?
– Replicate laboratory study
– Re: Darren’s talk on Tuesday
– Extend to other conditions
• Need:
– Large number of judgments (200,000)
– Large number of Turkers (> 1,500)
Harrison et al., Ranking Visualization Effectiveness Using Weber's Law. IEEE InfoVis 2014
Lane Harrison, Fumeng Yang
Another Experiment
Imagine yourself in a dark room….
Perceptual Modeling
• Weber’s Law (mid 1800s)
– Low-level perceptual discrimination (sound, touch, taste, brightness, etc.)
$$dP = k \frac{dS}{S}$$
where $dP$ is the perceived difference, $dS$ is the change in intensity, $S$ is the intensity of the stimulus, and $k$ is Weber’s fraction (determined via experiments).
Given a fixed stimulus $S$, the smallest $dS$ that can be perceived by humans is known as the “Just Noticeable Difference”, or JND
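As a worked step, the JND can be read directly off the formula $dP = k\,dS/S$. The symbols $dP_{\min}$ (the smallest perceived difference a person can register) and $S_0$ (a reference intensity) are introduced here for illustration and do not appear on the slides.

```latex
\begin{align*}
dP_{\min} &= k\,\frac{dS_{\mathrm{JND}}}{S}
  \quad\Longrightarrow\quad
  dS_{\mathrm{JND}} = \frac{dP_{\min}}{k}\,S, \\
P(S) &= \int_{S_0}^{S} k\,\frac{ds}{s} = k \ln\frac{S}{S_0}.
\end{align*}
```

Under these assumptions the JND grows in proportion to the stimulus intensity, so the ratio $dS_{\mathrm{JND}}/S$ stays constant. Integrating the same relation gives a perceived magnitude that grows logarithmically with intensity.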
Replication Study
• Used MTurk to replicate Ron Rensink’s 2010 experiment, which shows that the relationship between JND and correlation (r) is linear and follows Weber’s Law
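One way to check a linear relationship between JND and correlation is to fit a straight line to measured (r, JND) pairs. The sketch below uses ordinary least squares and is not the analysis from the study itself. The function name `fitLine` is an assumption, and the sample numbers are made-up placeholders, not experimental data.

```javascript
// Ordinary least-squares fit of y = intercept + slope * x.
function fitLine(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need at least two (x, y) pairs of equal length");
  }
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }
  if (sxx === 0) throw new Error("all x values are identical");
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Placeholder measurements, NOT data from the study: JND shrinking as r grows.
const correlations = [0.1, 0.3, 0.5, 0.7, 0.9];
const jnds = [0.23, 0.19, 0.15, 0.11, 0.07];
const jndFit = fitLine(correlations, jnds);
console.log(jndFit.slope.toFixed(2), jndFit.intercept.toFixed(2));
```

A negative slope here would mean that differences in correlation are easier to detect at high r. In practice one would also look at the residuals before treating the relationship as linear.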
Our Question...
If the perception of correlation in scatterplots follows Weber’s law...
[Figure axis labels: worse, better]
What does the perception of correlation in other charts look like?
[Figure axis labels: worse, better]
[Figure axis labels: more precise, less precise]
The perception of correlation in every tested chart can be modeled using Weber’s law.
Ranking Visualizations of Correlation
Lessons Learned
• MTurk is good for perceptual studies
– Slightly higher variance
– Overall trend and values hold
• Useful for between-subject studies
• Check device/browser type (disallow handheld devices)
• Check screen resolution
– window.screen.width
– window.screen.availWidth
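The device and resolution checks above could be combined into a single gate at the start of a study, as in the sketch below. The function name `checkDisplay`, the 1024-pixel default, and the user-agent pattern are assumptions. User-agent sniffing is only a rough heuristic for spotting handheld devices.

```javascript
// Decide whether a participant's device meets the study's display requirements.
// `screenInfo` is expected to look like the browser's window.screen object.
// The default minimum width of 1024 pixels is an assumed threshold.
function checkDisplay(screenInfo, userAgent, minWidth = 1024) {
  const reasons = [];
  // Heuristic only: user-agent strings can be missing or spoofed.
  if (/Mobi|Android|iPhone|iPad|iPod/i.test(userAgent)) {
    reasons.push("handheld device");
  }
  if (screenInfo.width < minWidth) {
    reasons.push("screen too narrow");
  }
  if (screenInfo.availWidth < minWidth) {
    reasons.push("available width too narrow");
  }
  return { ok: reasons.length === 0, reasons };
}

// In a browser this would be called as:
//   const result = checkDisplay(window.screen, navigator.userAgent);
//   if (!result.ok) { /* show a message and stop the study */ }
```

Returning the list of reasons rather than a bare boolean makes it possible to log why a participant was turned away.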
3: Cognitive Studies
• Research Question: Do individual differences affect people’s ability to use visualizations?
– Priming emotion
– Priming locus of control
• Need:
– Large number of participants (~1,000)
– Long complicated study design (~30 minutes)
– Speed (and accuracy) matter
Alvitta Ottley, Lane Harrison
What is Priming?
Verbal Priming
• For example, to make someone have a more external LOC:
“We know that one of the things that influence how well you can do everyday tasks is the number of obstacles you face on a daily basis. If you are having a particularly bad day today, you may not do as well as you might on a day when everything goes as planned. Variability is a normal part of life and you might think you can’t do much about that aspect. In the space provided below, give 3 examples of times when you have felt out of control and unable to achieve something you set out to do. Each example must be at least 100 words long.”
Priming Emotion on Visual Judgment
Harrison et al., Influencing Visual Judgment Through Affective Priming, CHI 2013
Background: Locus of Control and VIS
• Task (Inferential): “There’s something interesting about folder XXX. Find another folder that shares a similar pattern”
[Figure: four visualizations labeled V1, V2, V3, V4]
Ziemkiewicz et al., How Locus of Control Influences Compatibility with Visualization Style, IEEE VAST 2011.
Locus of Control and VIS
• When using the list view rather than the containment view, internal LOC users are:
– Faster (by 70%)
– More accurate (by 34%)
• The speed improvement is about 2 minutes (116 seconds)
Results: Average Primed to be Internal
[Figure: performance (poor to good) against visual form (list-view, containment) for external, average, and internal LOC users]
Ottley et al., Manipulating and Controlling for Personality Effects on Visualization Tasks, Information Visualization, 2013
Results: Internal Primed to be External
[Figure: performance (poor to good) against visual form (list-view, containment) for external, average, and internal LOC users]
Ottley et al., Manipulating and Controlling for Personality Effects on Visualization Tasks, Information Visualization, 2013
Results: External Primed to be Internal
[Figure: performance (poor to good) against visual form (list-view, containment) for external, average, and internal LOC users]
Ottley et al., Manipulating and Controlling for Personality Effects on Visualization Tasks, Information Visualization, 2013
Results
Lessons Learned
• Pay minimum wage
– 30 minutes ~= $3
– IRB requires that we pay everyone
• Bonus on time and accuracy
• Disable the “Back” button
• Estimate about 20%** of all resulting data will be “bad” in some way, e.g.
– Failing pre-task or consistency check
– Completed way too fast
• Recruit from North America only
** The percentage has decreased in the past few years. It used to be around 40%.
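The “bad” data criteria above could be applied in a post-processing pass like the sketch below. The field names (`passedPreTask`, `passedConsistencyCheck`, `durationSeconds`), the function names, and the five-minute threshold in the example are hypothetical. The appropriate cutoff depends on the study.

```javascript
// Return the reasons a submission looks "bad"; an empty list means it is kept.
function flagSubmission(sub, minSeconds) {
  const flags = [];
  if (!sub.passedPreTask) flags.push("failed pre-task");
  if (!sub.passedConsistencyCheck) flags.push("failed consistency check");
  if (sub.durationSeconds < minSeconds) flags.push("completed too fast");
  return flags;
}

// Split submissions into kept and dropped, recording why each was dropped
// so that exclusions can be audited and reported.
function splitSubmissions(subs, minSeconds) {
  const kept = [];
  const dropped = [];
  for (const sub of subs) {
    const flags = flagSubmission(sub, minSeconds);
    if (flags.length === 0) kept.push(sub);
    else dropped.push({ id: sub.id, flags });
  }
  return { kept, dropped };
}

// Hypothetical submissions for a ~30 minute study, with an assumed 5 minute floor.
const submissions = [
  { id: "w1", passedPreTask: true, passedConsistencyCheck: true, durationSeconds: 1750 },
  { id: "w2", passedPreTask: true, passedConsistencyCheck: true, durationSeconds: 140 },
  { id: "w3", passedPreTask: false, passedConsistencyCheck: true, durationSeconds: 1900 },
];
const screened = splitSubmissions(submissions, 300);
console.log(screened.kept.length, screened.dropped.length);
```

Dropped participants are still paid, consistent with the IRB point above. The filter only decides which data enter the analysis.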
Conclusion
• Crowdsourcing works for VIS experiments
• Collect User Generated Data
• Conduct Perceptual Experiments
• Run Cognitive Studies
• Many “gotchas” but most of them are avoidable
Questions?
remco@cs.tufts.edu