Information Interfaces Research Group
School of Interactive Computing
Georgia Institute of Technology
John Stasko
Visualization for
Information Analysis and Exploration
Sept. 17, 2008
2
• Get out pencil & paper
Exercise
3
– news, sports, financial, purchases, etc...
• Computers, internet and web give people
access to an incredible amount of data
– There simply is more “stuff”
• Society is more complex
Data Explosion
4
– How do we avoid being overwhelmed?
– How do we harness this data in decision-making processes?
– How do we make sense of the data?
• Confound: How to make use of the data
Data Overload
5
• Transform the data into information
(understanding, insight) thus making it
useful to people
The Challenge
6
• Visualization of data helps people
understand it better
Premise of my Work
7
– Much is done preattentively, i.e., without
conscious thought
– Strong pattern recognition
– Parallel
– ~100 MB/s
• Highest bandwidth sense
Human Vision
8
• From [Card, Mackinlay, Shneiderman ’98]
– “The use of computer-supported, interactive
visual representations of data to amplify
cognition.”
• Definition
Visualization
9
– Insight: discovery, decision making,
explanation, analysis, exploration, learning
• “The purpose of visualization is insight,
not pictures”
– Internalize an understanding
– Form a mental image of something
• Really is a cognitive process
• Often thought of as process of creating a
graphic or an image
Visualization
10
Larkin & Simon ’87
Card, Mackinlay, Shneiderman ‘98
– Role of external world in thinking and reasoning
• External cognition aid
• Pattern matching
• Cognition → Perception
– Provide a frame of reference, a temporary
storage area
• Visuals help us think
Main Idea
11
– Want to know what questions to ask
– Don’t have a priori questions
– Don’t know what you’re looking for
• Visualization most useful in exploratory
data analysis
– Data mining, DB queries, machine learning…
• Many other techniques for data analysis
When to Apply?
12
• “A picture is worth a thousand words”
• “Seeing is believing”
• “I see what you’re saying”
Part of our Culture
13
Some quick (static) examples…
14
E. Tufte, The Visual Display of Quantitative Information
NYC Weather
2220 numbers
London Subway
15
www.thetube.com
True Geography
16
www.kottke.org/plus/misc/images/tubegeo.gif
17
Easy Walking Lines Added
rodcorp.typepad.com/photos/art_2003/tube_walklines_final_lmfaint.html
Atlanta Journal
April 30, 2000
Atlanta Flight Traffic
18
19
InfoVis ‘07
20
Reinforce my point with two examples
Questions:
21
Which cereal has the most/least potassium?
Is there a relationship between potassium and fiber?
If so, are there any outliers?
Which manufacturer makes the healthiest cereals?
Fiber
Potassium
22
23
• What if I read the data to you?
• What if you could only see one cereal’s
data at a time? (e.g. some websites)
Even Tougher?
24
http://astro.swarthmore.edu/astro121/anscombe.html
• Coefficient of determination = 0.67
• Correlation coefficient = 0.82
• Residual sum of squares (about the regression line)
= 13.75
• Regression sum of squares (variation accounted for by x)
= 27.5
• Sum of squares of x (about the mean of x) = 110.0
• Equation of the least-squares regression line is: y = 3 + 0.5x
• Mean of the y values = 7.5
• Mean of the x values = 9.0
Four Data Sets
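The summary figures on this slide are tied together by a few standard regression identities, which the worked check below spells out. It assumes that the 110.0 figure is the sum of squares of x about its mean (written S_xx here), that a and b denote the intercept and slope of y = 3 + 0.5x, and that r takes the positive root because the slope is positive. The symbol names are chosen for this sketch and are not from the slides.

```latex
\begin{align*}
SS_{\text{total}} &= SS_{\text{regression}} + SS_{\text{residual}} = 27.5 + 13.75 = 41.25 \\
R^2 &= \frac{SS_{\text{regression}}}{SS_{\text{total}}} = \frac{27.5}{41.25} \approx 0.67 \\
r &= \sqrt{R^2} \approx 0.82 \\
SS_{\text{regression}} &= b^2 \, S_{xx} = (0.5)^2 \times 110.0 = 27.5 \\
\bar{y} &= a + b\,\bar{x} = 3 + 0.5 \times 9.0 = 7.5
\end{align*}
```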
The Data Sets
25
1
10.0, 8.04
8.0, 6.95
13.0, 7.58
9.0, 8.81
11.0, 8.33
14.0, 9.96
6.0, 7.24
4.0, 4.26
12.0,10.84
7.0, 4.82
5.0, 5.68
The Values
2
10.0, 9.14
8.0, 8.14
13.0, 8.74
9.0, 8.77
11.0, 9.26
14.0, 8.10
6.0, 6.13
4.0, 3.10
12.0, 9.13
7.0, 7.26
5.0, 4.74
26
3
10.0, 7.46
8.0, 6.77
13.0,12.74
9.0, 7.11
11.0, 7.81
14.0, 8.84
6.0, 6.08
4.0, 5.39
12.0, 8.15
7.0, 6.42
5.0, 5.73
4
8.0, 6.58
8.0, 5.76
8.0, 7.71
8.0, 8.84
8.0, 8.47
8.0, 7.04
8.0, 5.25
19.0,12.50
8.0, 5.56
8.0, 7.91
8.0, 6.89
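The point of the four data sets is that their summary statistics agree even though the values differ. A minimal Python sketch of that check follows, using the (x, y) pairs transcribed from the lists above. The names `DATA_SETS`, `summarize`, and `SUMMARIES` are made up for this illustration.

```python
from statistics import mean

# (x, y) pairs transcribed from the four data sets listed above.
DATA_SETS = {
    1: [(10.0, 8.04), (8.0, 6.95), (13.0, 7.58), (9.0, 8.81), (11.0, 8.33),
        (14.0, 9.96), (6.0, 7.24), (4.0, 4.26), (12.0, 10.84), (7.0, 4.82),
        (5.0, 5.68)],
    2: [(10.0, 9.14), (8.0, 8.14), (13.0, 8.74), (9.0, 8.77), (11.0, 9.26),
        (14.0, 8.10), (6.0, 6.13), (4.0, 3.10), (12.0, 9.13), (7.0, 7.26),
        (5.0, 4.74)],
    3: [(10.0, 7.46), (8.0, 6.77), (13.0, 12.74), (9.0, 7.11), (11.0, 7.81),
        (14.0, 8.84), (6.0, 6.08), (4.0, 5.39), (12.0, 8.15), (7.0, 6.42),
        (5.0, 5.73)],
    4: [(8.0, 6.58), (8.0, 5.76), (8.0, 7.71), (8.0, 8.84), (8.0, 8.47),
        (8.0, 7.04), (8.0, 5.25), (19.0, 12.50), (8.0, 5.56), (8.0, 7.91),
        (8.0, 6.89)],
}


def summarize(pairs):
    """Return means, least-squares line, and correlation for (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    # Sums of squares and cross-products about the means.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    slope = s_xy / s_xx
    return {
        "mean_x": x_bar,
        "mean_y": y_bar,
        "slope": slope,
        "intercept": y_bar - slope * x_bar,
        "r": s_xy / (s_xx * s_yy) ** 0.5,
        "s_xx": s_xx,
    }


SUMMARIES = {label: summarize(pairs) for label, pairs in DATA_SETS.items()}

for label, stats in SUMMARIES.items():
    print(label, {name: round(value, 2) for name, value in stats.items()})
```

Rounded to two decimals, the four printed rows should be nearly indistinguishable, which is why the numbers alone do not reveal how different the data sets are until they are plotted.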
27
• What did you put on paper?
Revisit Starting Exercise
• Visual Analytics
28
• Information Visualization
Two Related Disciplines
29
• Area emerged approximately 1990
– Statistics, databases, software, …
• Using interactive computer visualizations
to represent and communicate abstract
data
Information Visualization
30
– Interaction is crucial
– Challenges of evaluation
– InfoVis for the Masses
• Recent research trends
Information Visualization
31
• Area emerged approximately 2005
• InfoVis++
• Formal: The science of analytical
reasoning facilitated by interactive visual
interfaces
• Informal: Using visual representations to
help make decisions
Visual Analytics
• Positioning for an Enduring
Success
• Moving Research Into
Practice
• Production, Presentation, and
Dissemination
• Data Representations
and Transformations
• Science of Visual
Representations
and Interactions
• Science of Analytical
Reasoning
• Challenges
Overview of the R&D Agenda
32
• Decision sciences
33
• Communications: capture, illustrate and present a message
• Cognitive and Perceptual Sciences
• Ontology, semantics, NLP, extraction, synthesis, …
• Knowledge representation, management and discovery
• Applied Mathematics
• Geospatial and Temporal Sciences
• Statistics, data representation and statistical graphics
Visual Analytics: Beyond InfoVis
Information
Visualization
~1990
Academic Context
34
Visual
Analytics
~2005
IEEE VAST
35
IEEE InfoVis
36
“A motivated, continuous effort to understand
connections (which can be among people,
places, and events) in order to anticipate
their trajectories and act effectively.”
– Klein, Moon and Hoffman
Sensemaking
37
• Visualization for Investigative Analysis
across Document Collections
Jigsaw
38
Gennadiy Stepanov
Sarah Williams
Neel Parekh
Kanupriyah Singhal
Carsten Görg
Zhicheng Liu
Vasili Pantazopoulos
+ 4 new students
The Jigsaw Team
Pirolli & Card, ICIA ‘05
39
40
• Analysts’ span of attention for evidence
and hypotheses
• Cost structure of scanning and selecting
items for further attention
Pain Points
Documents/
case reports
41
Blogs
DBs
• Help investigative analysts discover
plans, plots and threats embedded across
the individual documents in large
document collections
Problem Addressed
Example Document
42
43
• Thesis: A plot/threat within the
documents will involve a set of entities in
coordination
– Person, place, organization, phone number,
date, license plate, etc.
• Entities within the documents
Our Focus
44
• Not our main research focus –
Collaborate with or use tools from others
– Crucial for our work
• Must identify and extract entities from
plain text documents
Entity Identification
Entities Identified
45
46
– The more documents they appear in
together, the stronger the connection
– Two entities are connected if they appear in
a document together
• Connection definition:
• Entities relate/connect to each other to
make a larger “story”
Connections
“Putting the pieces together”
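The connection definition above (two entities are connected if they share a document, and more shared documents mean a stronger connection) can be sketched as a co-occurrence count. This is an illustrative sketch only, not Jigsaw's implementation. The function name `connection_strengths`, the document ids, and the entity values are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def connection_strengths(doc_entities):
    """Count, for each unordered entity pair, the documents containing both."""
    strengths = Counter()
    for entities in doc_entities.values():
        # set() ignores repeated mentions within one document;
        # sorted() gives each pair a single canonical ordering.
        for a, b in combinations(sorted(set(entities)), 2):
            strengths[(a, b)] += 1
    return strengths


# Hypothetical documents and entities, invented for this example.
docs = {
    "report_1": ["Alice", "Atlanta", "555-0100"],
    "report_2": ["Alice", "Atlanta", "Alice"],
    "report_3": ["Bob", "Atlanta"],
}

strengths = connection_strengths(docs)
for pair, count in strengths.most_common():
    print(pair, count)
```

Pairs that never share a document are simply absent from the counter, so looking them up yields a strength of zero.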
47
• User actions generate events
that are transmitted to and
(possibly) reflected in other
views
• Views are highly interactive and
coordinated
• Multiple visualizations (views) of
documents, entities, & their connections
Jigsaw
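The coordination described above, where a user action in one view generates an event that other views may or may not reflect, resembles a publish/subscribe arrangement. The sketch below is one way to model that idea and is not taken from Jigsaw's code. The classes `EventBus` and `View`, the event name `entity_selected`, and the entity value are hypothetical.

```python
class EventBus:
    """Relays events from one view to every other subscribed view."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def publish(self, source, event, payload):
        for listener in self._listeners:
            if listener is not source:  # do not echo back to the sender
                listener.on_event(event, payload)


class View:
    """A view that can emit selections and optionally mirror others'."""

    def __init__(self, name, bus, reacts_to):
        self.name = name
        self.bus = bus
        self.reacts_to = set(reacts_to)
        self.selected = None
        bus.subscribe(self)

    def select(self, entity):
        # A user action: update this view, then notify the others.
        self.selected = entity
        self.bus.publish(self, "entity_selected", entity)

    def on_event(self, event, payload):
        # "Possibly" reflected: a view ignores events it does not react to.
        if event in self.reacts_to:
            self.selected = payload


bus = EventBus()
list_view = View("List View", bus, reacts_to={"entity_selected"})
graph_view = View("Graph View", bus, reacts_to={"entity_selected"})
calendar_view = View("Calendar View", bus, reacts_to=set())

list_view.select("Alice")
print(graph_view.selected, calendar_view.selected)
```

Routing events through a shared bus keeps the views independent of one another, so a view can be added or removed without changing the rest.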
System Views
48
49
The Need for Pixels
50
Demo
Console
51
Document View
52
List View
53
Graph View
54
Scatterplot View
55
Calendar View
56
Report Cluster View
57
Timeline View
58
Shoebox
59
60
• Transitioning system to real clients
Trial Use
• Reliability/uncertainty
• Other types of data
• Themes/concepts
• Enhanced evidence
marshalling
61
• Connectivity search
• Collaborative version
• Scalability issues
• Present/browse
investigation history
• Geospatial View
• Evaluation
• Deployment
• Display wall?
• Web search & situational
awareness
• Wikipedia & Intellipedia
• Entity Identification
Future Work
62
• Including flexible, useful interaction is
one of the best ways to do this
– Not to just illustrate and reconfirm existing
knowledge
• Design your visualization systems and
tools to facilitate analysis and exploration
Take Away Point
63
• http://www.gvu.gatech.edu/ii
To Learn More
64
• Some slides in this presentation
borrowed from overviews of visual
analytics by Jim Thomas, NVAC Director
Acknowledgment
65
• Supported by NSF IIS-0414667
• Work conducted as part of the
Southeastern Regional Visualization and
Analytics Center, supported by DHS and
NVAC
Acknowledgments
• Questions?
66
• Thanks for your attention!
End