What Researchers Want
Cody Dunne
Dept. of Computer Science and
Human-Computer Interaction Lab,
University of Maryland
cdunne@cs.umd.edu
Links from this talk: bit.ly/stmwant
STM 3rd Master Class
November 7-9, 2011, Adelphi, MD, USA
1
Researchers want to…
1. Find a specific paper
2. Explore a research area
3. Do retrospective analysis
4. Share their results
2
1. Find a specific paper
• Metadata or PDF?
• From memory (search)
• From reference list
– DOI/URL
– Search
3
2. Exploring a research area
• Foundations
• Emerging research topics
• State of the art/open problems
• Collaborations & relationships between communities
• Field evolution
• Easily understandable surveys
4
User requirements
• Control over the paper collection
– Choose custom subset via query, then iteratively drill down, filter, & refine
• Overview either as visualization or text statistics
– Orient within subset
• Easy to understand metrics for identifying interesting papers
– Ranking & filtering
• Create groups & annotate with findings
– Organize discovery process
– Share results
5
Action Science Explorer
• Bibliometric lexical link mining to create a citation network and citation context
• Network clustering and multi-document summarization to extract key points
• Potent network analysis and visualization tools
www.cs.umd.edu/hcil/ase
6
7
Reference management & grouping
8
Citation network overview
Communities, outliers, invalid data
9
Statistics & visualization
• Network statistics
– Degree
– Betweenness
– Closeness
– PageRank
• Attributes
– Year
– Downloads
– Citations
– References
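
As an illustration of the network statistics listed above, here is a minimal sketch in plain Python. It is not the Action Science Explorer implementation; the four-paper citation graph and all function names are hypothetical, and betweenness is left out for brevity. It computes in-degree (citations received), closeness on the undirected graph, and PageRank by power iteration.

```python
from collections import deque

# Hypothetical toy citation network: paper -> papers it cites.
CITES = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
    "D": ["C"],
}

def in_degree(graph):
    """Number of citations each paper receives."""
    counts = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts

def closeness(graph):
    """Closeness on the undirected graph: reachable nodes / sum of distances."""
    neighbors = {node: set() for node in graph}
    for source, targets in graph.items():
        for target in targets:
            neighbors[source].add(target)
            neighbors[target].add(source)
    scores = {}
    for start in graph:
        # Breadth-first search gives shortest-path distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        scores[start] = (len(dist) - 1) / total if total else 0.0
    return scores

def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; papers citing nothing spread rank evenly."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        dangling = sum(rank[v] for v, targets in graph.items() if not targets)
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in graph}
        for source, targets in graph.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank

if __name__ == "__main__":
    print(in_degree(CITES))
    print(closeness(CITES))
    print(pagerank(CITES))
```

Sorting papers by any of these scores gives the kind of ranking and filtering the user requirements call for.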
10
Field evolution
11
Citation context & summarization
• Citation context
– Key contributions
– Critical reception
– Citations to subsequent/similar work
• Hyperlinked citations in text
– See surrounding context of citation
– View cited papers while reading
• Multi-document summarization
– Citation context
– Abstract
– Full text
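
To make "surrounding context of citation" concrete, here is a rough sketch that pulls out the sentences around a citation marker. It assumes bracketed numeric markers such as `[1]` and uses a naive sentence splitter; the function name and sample text are hypothetical, and this is not the method used by Action Science Explorer.

```python
import re

def citation_contexts(text, marker, window=1):
    """Return each sentence containing `marker`, plus `window` sentences on each side."""
    # Naive split after ., ! or ? followed by whitespace; real papers need a proper tokenizer.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    contexts = []
    for i, sentence in enumerate(sentences):
        if marker in sentence:
            start = max(0, i - window)
            end = min(len(sentences), i + window + 1)
            contexts.append(" ".join(sentences[start:end]))
    return contexts

# Made-up sample text for illustration only.
SAMPLE = (
    "Trees are hard to draw at scale. "
    "Treemaps [1] fill the screen with nested rectangles. "
    "Later work extended this idea. "
    "Cone trees [2] use 3D animation instead."
)

if __name__ == "__main__":
    for context in citation_contexts(SAMPLE, "[1]"):
        print(context)
```

Collecting these contexts across all papers that cite the same work is one possible input for multi-document summarization.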
12
3. Retrospective analysis
• Automatic collection & processing of bibliometric data
• Easy access to visual analytic tools for finding clusters, trends, outliers
• Communities for sharing data, tools, & results
13
STICK Project
• Scientific, data-driven way to track innovations
– Vs. current expert-based, time-consuming approaches (e.g., Gartner’s Hype Cycle, tire track diagrams)
• Includes both concept and product forms
– Study relationships between them
• Study the innovation ecosystem
– Organizations & people
– Both those producing & using innovations
stick.ischool.umd.edu
14
Case study: tree visualization
• Problem: Traditional 2D node-link diagrams of trees become too large
• Solutions:
– Treemaps: Nested Rectangles
– Cone Trees: 3D Interactive Animations
– Hyperbolic Trees: Focus + Context
• Measures:
– Papers, articles, patents, citations,…
– Press releases, blog posts, tweets,…
– Users, downloads, sales,…
15
Treemaps: nested rectangles
www.cs.umd.edu/hcil/treemap-history
16
SmartMoney MarketMap, Feb 27, 2007
smartmoney.com/marketmap
17
Cone trees: 3D interactive animations
Robertson, G. G., Card, S. K., and Mackinlay, J. D., Information visualization using 3D interactive animation,
Communications of the ACM, 36, 4 (1993), 51-71.
Robertson, G. G., Mackinlay, J. D., and Card, S. K., Cone trees: Animated 3D visualizations of hierarchical information,
Proc. ACM SIGCHI Conference on Human Factors in Computing Systems, ACM Press, New York, (April 1991), 189-194.
Hyperbolic trees: focus & context
Lamping, J. and Rao, R., Laying out and visualizing large trees using a hyperbolic space, Proc. 7th Annual ACM Symposium on User Interface Software and Technology, ACM Press, New York (1994), 13-14.
Lamping, J., Rao, R., and Pirolli, P., A focus+context technique based on hyperbolic geometry for visualizing large
Case study: tree visualization impact
TM = Treemaps
CT = Cone Trees
HT = Hyperbolic Trees
20
Case study: tree visualization citations
TM = Treemaps
CT = Cone Trees
HT = Hyperbolic Trees
21
Case study: business intelligence
ProQuest News 2000-2009
Co-occurrence of concepts with organizations
Data Mining
• National Security Agency
• White House
• FBI
• AT&T
• American Civil Liberties Union
• Electronic Frontier Foundation
• Dept. of Homeland Security
• CIA
Year
22
Business Intelligence 2000-2009
Matrix showing co-occurrence of concepts and entities
23
Business Intelligence 2000-2009: (subset)
24
Business Intelligence 2000-2009:
Data mining
• NSA
• CIA
• FBI
• White House
• Pentagon
• DOD
• DHS
• AT&T
• ACLU
• EFF
• Senate Judiciary Committee
25
Business Intelligence 2000-2009:
Tech1
• Yahoo
• Stanford
• Apple
Tech2
• IBM, Cognos
• Microsoft
• Oracle
Finance
• NASDAQ
• NYSE
• SEC
• NCR
• MicroStrategy
26
Business Intelligence 2000-2009:
• Air Force
• Army
• Navy
• GSA
• UMD*
27
STICK Process
• Identify concepts
• Query data sources
– News
– Dissertation
– Academic
– Patent
– Blogs
• Processing
– Automatic entity recognition
– Crowd-sourced verification
– Co-occurrence networks
• Visualizing & analyzing
– Overall statistics
– Co-occurrence networks
– Network evolution
• Sharing results
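
The co-occurrence step in this process can be sketched as follows. This is a simplified illustration, not the STICK pipeline: it assumes entity recognition has already tagged each document, and the three documents and their counts are invented for the example (only the concept and organization names come from the slides).

```python
from collections import Counter
from itertools import product

# Invented toy corpus: each document already tagged with concepts and organizations.
DOCS = [
    {"concepts": {"data mining"}, "orgs": {"NSA", "AT&T"}},
    {"concepts": {"data mining", "business intelligence"}, "orgs": {"NSA"}},
    {"concepts": {"business intelligence"}, "orgs": {"IBM", "Oracle"}},
]

def cooccurrence(docs):
    """Count how many documents mention each (concept, organization) pair."""
    counts = Counter()
    for doc in docs:
        for concept, org in product(doc["concepts"], doc["orgs"]):
            counts[(concept, org)] += 1
    return counts

if __name__ == "__main__":
    for (concept, org), n in cooccurrence(DOCS).most_common():
        print(f"{concept} / {org}: {n}")
```

Each counted pair can be read either as a cell in a concept-by-entity matrix or as a weighted edge in a co-occurrence network; bucketing the documents by year before counting would give one network per period for studying network evolution.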
28
4. Sharing results
• Easily usable metadata (BibTeX, EndNote, etc.)
• Collaborative authoring
• Online communities
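
As a small illustration of easily usable metadata, the sketch below renders a metadata dictionary as a BibTeX entry. The function name and the citation key are hypothetical; the field values are taken from the cone trees reference cited in this talk.

```python
def to_bibtex(key, entry_type, fields):
    """Render a metadata dict as a BibTeX entry string."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)

# Field values from the reference list in this talk; the key is made up.
ENTRY = {
    "author": "Robertson, G. G. and Card, S. K. and Mackinlay, J. D.",
    "title": "Information visualization using 3D interactive animation",
    "journal": "Communications of the ACM",
    "volume": "36",
    "number": "4",
    "year": "1993",
    "pages": "51--71",
}

if __name__ == "__main__":
    print(to_bibtex("robertson1993", "article", ENTRY))
```

This sketch does no escaping of special characters, which a real exporter would need.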
29
Collaborative literature reviews
• Organized references
• Annotated PDFs
www.mendeley.com
30
Shared data & analysis repositories
stick.ischool.umd.edu/community
31
Researchers want to…
1. Find a specific paper
2. Explore a research area
3. Do retrospective analysis
4. Share their results
32
This work has been partially supported by
NSF grants IIS 0705832 (ASE) and
SBE 0915645 (STICK)
33