WWW and InfoSphere

advertisement
Software Visualization
CS 4460 - Information Visualization
Many slides courtesy John Stasko,
with edits by J. Foley
Software Visualization
• Definition
“The use of the crafts of typography,
graphic design, animation, and
cinematography with modern humancomputer interaction and computer
graphics technology to facilitate both
the human understanding and effective
use of computer software.”
Price, Baecker and Small, ‘98
CS 4460/7450
2
Challenge
• Unlike much information visualization,
however, software is often dynamic, thus
requiring our visualizations reflect the
time dimension
 History views
 Animation
 ...
CS 4460/7450
3
Subdomains
• Two main subareas of software
visualization
 Program visualization
Use of visualization to help programmers, coders,
developers
Software engineering focus
 Algorithm visualization
Use of visualization to help teach algorithms and
data structures
Pedagogy focus
CS 4460/7450
4
Caveat
• This is a HUGH area
• Presentation goal: provide flavor of kinds
of techniques and systems that have been
created
 Lots of screen shots
 Some videos
CS 4460/7450
5
Program Visualization
• Can be as simple as enhanced views of
program source
• Can be as complex as views of the
execution of a highly parallel program, its
data structures, run-time heap, etc.
CS 4460/7450
6
PV is a big research Area
CS 4460/7450
7
PV is a Big Product Category
• Do Google searches on
 Program visualization
 Code visualization
 Software visualization
• Lots of companies/products
CS 4460/7450
8
What Can We Visualize?
• ??????
What Can We Visualize?
• Structure
 Static code structures
• Behavior
 Dynamic (execution) code behavior
 Test suite results
 Bug issues/relations
• Evolution
 Code repository structure/evolution
 Program team communications patterns
CS 4460/7450
10
Static Code Structures
•
•
•
•
Call graphs
Object class hierarchy
Scope of variables
Code module sizes
CS 4460/7450
11
Execution Data
Summaries, such as
 Total running time
 Number of times a method
was called
 Amount of time CPU was
idle
 Bytes read/written
 Memory high-water mark
 Etc etc etc
CS 4460/7450
Details, such as







Memory allocations
System calls
Cache misses
Page faults
Pipeline flushes
Process scheduling
Completion of disk reads or
writes
 Message receipt
 Application phases
 Etc etc etc
12
Test Results
• Hundreds, maybe thousands of tests
• For each test:
 Purpose
 Result (pass or fail)
Could be per-configuration or per-version
 Relevant parts of the code
CS 4460/7450
13
Really Detailed Execution Data
• Logging virtual machines can capture everything
 Enough data to replay program execution and
recreate the entire machine state at any point in time
 Allows “time-traveling”
 For long running systems, data could span months
• Uses:
 Debugging
 Understanding attacks
CS 4460/7450
14
SoftVis 2010 Program
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
An Interactive Ambient Visualization for Code Smells
CodePad: Interactive Spaces for Maintaining Concentration in Programming Environments
User Evaluation of Polymetric Views Using a Large Visualization Wall
Software Evolution Storylines
AllocRay: Memory Allocation Visualization for Unmanaged Languages
Heapviz: Interactive Heap Visualization for Program Understanding and Debugging
A Map of the Heap: Revealing Design Abstractions in Runtime Structures
Trevis: A Context Tree Visualization & Analysis Framework and Its Use for Classifying Performance Failure Reports
Exploring the Inventor's Paradox: Applying Jigsaw to Software Visualization
Dependence Cluster Visualization
Towards Anomaly Comprehension: Using Structural Compression to Navigate Profiling Call-Trees
Embedding Spatial Software Visualization in the IDE: An Exploratory Study
3D Kiviat Diagrams for the Interactive Analysis of Software Metric Trends
Graph Works - Pilot Graph Theory Visualization Tool
Visualizing Software Entities Using a Matrix Layout
ImpactViz: Visualizing Class Dependencies and the Impact of Changes in Software Revisions
VIPERS: Visual Prototyping Environment for Real-Time Imaging Systems
Towards Automated Analysis and Visualization of Distributed Software Systems
TIE: An Interactive Visualization of Thread Interleavings
GEM: Graphical Explorer of MPI Programs
Fault Forest Visualization
Visualizing Windows System Traces
Understanding Complex Multithreaded Software Systems by Using Trace Visualization
Zinsight: A Visual and Analytic Environment for Exploring Large Event Traces
Jype - A Program Visualization and Programming Exercise Tool for Python Off-Screen Visualization Techniques for Class Diagrams
An Automatic Layout Algorithm for BPEL Processes
Visual Comparison of Software Architectures
Representing Development History in Software Cities
Frank xDIVA: Automatic Animation Between Debugging Break Points
Understanding Relaxed Memory Consistency Through Interactive Visualization
the Execution of Object Orientated Concurrent Programs
CS 4460/7450
15
Commercial System Screen Shots
CS 4460/7450
16
Dependency Graph
Ndepend commercial product;
http://www.ndepend.com/SampleReports/OnDb4o/NDependReport.html#/?screen=Main
CS 4460/7450
17
Graph and Matrix Rep’n
NDepend
CS 4460/7450
18
Code Hierarchy Treemap; Lines of Code
NDepend
CS 4460/7450
19
Open Source System Screen Shots
• From The Source-NavigatorTM IDE
 http://sourcenav.sourceforge.net/index.html
CS 4460/7450
20
Hierarchy Browser Window
CS 4460/7450
21
Cross-Reference Browser Window
CS 4460/7450
22
Cross-Reference Browser Window
CS 4460/7450
23
Open Source System Screen Shots
• From jBixbe
 http://www.jbixbe.com/index.html
CS 4460/7450
24
Structure Diagram
CS 4460/7450
25
Message Exchanges
CS 4460/7450
26
Multi-threading
CS 4460/7450
27
Research Examples
• The following are discussions of a variety
of research prototypes
• Lots and lots of papers in Software
Visualization Conference Proceedings
CS 4460/7450
28
Pretty Printing – Common Example
CS 4460/7450
29
More Sophisticated
• But not so common to apply more
sophisticated graphic design
• Baecker & Marcus, Design Principles for
the Enhanced Presentation of Computer
Program Source Text, Proceedings CHI ’86
• No user testing 
• Looks nice 
CS 4460/7450
30
Baecker&Marcus Pretty Printing
CS 4460/7450
31
SeeSoft System
• Pulled-back, far away view of source code
• Map one line of source to one line of
pixels
 Maintain indentation & length
 Color code lines in meaningful way
• Like taping your source code to the wall,
walking far away, then looking back at it
CS 4460/7450
Stephen G. Eick, Joseph L. Steffen and Eric E.
Sumner, Jr. “SeeSoft – A Tool for Visualizing
Line-Oriented Software Statistics.” IEEE
Transactions on Software Engineering,
18(11):957-968, November 1992.
32
SeeSoft View
Selected
code
15,000 lines
of code in
52 files
Heat map for colorcoded characteristic
CS 4460/7450
Details of
selected code
33
SeeSoft
• Code tracking (typically means mapping a
data attribute to line color)




Code modification (when, by whom)
Location of bug fixes for specific bug report
Location of code to implement a specific feature
Code coverage or hotspots
• Interactive
 Change color mappings
 Link from heat map to code overview
 Brush views – back to source code
CS 4460/7450
34
Tarantula
• Developed at GT
• Utilizes SeeSoft code view methodology
• Takes results of test suite run and helps
developer find program faults
• Key is the clever color mapping
Eagan, Harrold, Jones & Stasko
InfoVis ’01
Jones, Harrold & Stasko
ICSE ‘02
CS 4460/7450
35
Tarantula View
Red – code
executed
by failed
tests
Green executed
by passed
tests
Yellow executed
by both
passed and
failed
CS 4460/7450
36
Tarantula – Selective Display
Failed
tests
only
CS 4460/7450
37
Tarantula: Continuous Colour Mapping
• Extend discrete colour mapping by
 Interpolating between red and green
 Adjusting brightness according to number of
tests
• Possibilities:
 Number of passed or failed tests
 Ratio of passed to failed tests
 Ratio of % passed to % failed
CS 4460/7450
38
Tarantula: Continuous Colour Mapping
• For each line L
 Let p and f be the percentages of passed and
failed tests that executed L
 If p = f = 0, colour L grey
 Else, colour L according to
Hue: p / ( p + f ), where 0 is red and 1 is green
Brightness: max( p, f )
CS 4460/7450
39
CS 4460/7450
40
Tarantula: Future Work
• Does it help find bugs?
 Seems like it should
• Link red lines to the tests
 Static
 Dynamic
CS 4460/7450
41
Gammatella - Visualization of ProgramExecution Data for Deployed Software
Orso, Jones and Harrold, Proc. Of ACM
Symp. on Software Visualization, June
2003, pp. 67-76.
Gammatella: Tri-Level Representation
• System level
 Treemap of package/class hierarchy
 Size – code lines
 Smaller areas colored differently according as
code lines colored differently
• File level:
 SeeSoft-like view of code
• Statement level:
 Source code (colored text)
CS 4460/7450
43
Gammatella screen shot
Multiple
linked
views
CS 4460/7450
44
Gammatella
• Displays program execution data
 From data base of executions
• Visualization of data from many executions
 Code coverage and profiling data
 Execution properties
OS
Java version
Errors
Etc etc etc
CS 4460/7450
45
Color Coding
• Two variables
 Hue: Red to yellow to green
 Intensity: Dark to light
• Various attributes can be mapped to color
 Execution profile: red frequent, green
infrequent; intensity not used
 Other mappings not discussed
What could be done???
CS 4460/7450
46
System-Level View
Gray: Never
executed code
CS 4460/7450
Execution
Bar
Colored subdivisions
within a module block
indicate amounts of
code with various
usages
47
Execution Bar
• Each small vertical bar represents one
program execution (test run)
 Color code – results of test run
• May be hundreds of test runs
• Treemap shows information for selected test
run(s)
 Select one or range of runs by pointing
 Select multiple non-contiguous runs via data
base query on test run properties
 For multiple executions, color codings are
averages
CS 4460/7450
48
Gammatella: Critique
• Complete system – not just a visualization
• Nicely links code to structure
• Trial usage discovered useful but highlevel information
 Mainly relied on system view
• Needs more usage to determine utility
• Complex color coding – color and hue –
hard to decode
CS 4460/7450
49
CVSscan: Visualization of Code
Evolution
“CVSscan: Visualization of Code
Evolution”, Voinea, Telea, and van
Wijk, SoftVis 2005
Original presentation by Summer Adams
CS 4460/7450
50
Overview
• Objective - understand code evolution over
multiple versions - support maintenance
• Want to answer questions such as
 What code lines were added, removed, or
altered?
When? By whom?
 Which parts of code are unstable?
 How are changes correlated?
 How are development tasks distributed?
 Context in which code segment appeared?
CS 4460/7450
51
Screen Shot
Versions
View of code,
line by line
and version by
version
Portion of code
from selected
version
CS 4460/7450
52
Approach
• Takes information from CVS (Concurrent
Versioning System) source code repository
• Uses UNIX’s diff command to compare
versions
• Visualizes each version as column
 Lines of code represented as in SeeSoft
As thin horizontal color-coded strip
CS 4460/7450
53
Two Code Visualization Formats
• File-based
• Line-based
Local: like
SeeSoft, just
current lines
CS 4460/7450
Global: includes
future inserts
54
Code Visualization - Examples
• 65 software
versions
• Color code
 Green – constant
 Yellow – modified
 Red - modified by
deletion
 Light blue - modified by
insertion
 Light gray in global
view - inserted and
deleted lines
CS 4460/7450
Code cleanup stage
Local: like SeeSoft, just current lines
Global: includes future inserts
55
Metrics
Per code
line metric
• Metrics for
each version
 Code size
• Metrics for
each line of
code
 Author
 Lifetime at
current line
• Other ideas for
metrics??
CS 4460/7450
Per version
metric
56
Overall View
Preset
zoom levels
CS 4460/7450
57
Use Scenario
Status
Construct
Author
The same code, viewed with three different attributes being colorcoded.
The modified piece of code (yellow status) is a comment (green status)
and was modified by programmer whose color is purple (would be nice
to have color key )
CS 4460/7450
58
CVScan Evaluation
• Modest – two users, each on different
code repository
 Perl Script, 65 versions, 457 lines
 C program, 60 versions, 2900 lines
• 15 minutes training, 15 minutes use
• Assessment
 General usability- think out loud
 Effectiveness –code understandings after use
CS 4460/7450
59
CVScan Evaluation Results
• Perl (65 versions, 457 lines)




Familiar with overall organization of the file
Focus of each developer
Areas of significant modifications
What modifications referred to
• C (60 versions, 2900 lines)
 Did not have clear image of code’s evolution
 Did conclude that was legacy code adapted by
mainly two users to support IPv6 network
protocol
 Pointed out major modification to memory
manager
CS 4460/7450
60
CVScan Critique
• Very modest evaluation – inconclusive
• Use of metrics not very well developed
• Nicely leverages SeeSoft to visualizing
multiple versions of same code
CS 4460/7450
61
Stepping Up
• Next step in program visualization is to
help debugging and performance
optimization by visualizing program
executions
 Data, data structures, run-time heap,
memory, control flow, data flow, hot spots, ...
CS 4460/7450
62
Graph Visualization
• Area that we’ve already studied which is
very important to SV
• Graphs pop up everywhere in software
• Common use: Visualize a call graph,
visualize a flow chart, ...
CS 4460/7450
63
Sample CallGraph View
CS 4460/7450
64
Software Projects
• Come up a level, to group of developers
working on a code base
• What might you want to understand?
 For instance, in an open source project?
CS 4460/7450
65
Stargate: An Author-Centric Approach
to Software Project Visualization
Ogawas and Ma, Stargate: An
Author-Centric Approach to
Software Project Visualization
Proceedings of IEEE PacificVis
2008 March, 2008
CS 4460/7450
66
Stargate – Start with Modified Sunburst
CS 4460/7450
67
Stargate – Add Developers
• Dots are
developers
 Placed close to
files they
worked on
 Size =>
amount of
activity
• File colors =>
File type
CS 4460/7450
68
Stargate – Add Communications
• Email
communications
among
developers
CS 4460/7450
69
Stargate – Add File History
CS 4460/7450
70
Stargate Video
• http://vidi.cs.ucdavis.edu/research/videos
/stargate
• (Show from SV folder)
CS 4460/7450
71
SV Takeaways
• Multiple dimensions to SV
 Static code/program relationships
 Dynamic code/program relationships
 Testing/coverge
 Project evolution
• Many methods we have studied are
applicable, such as ways to depict
 Time-varying behaviors
 Object relationships
CS 4460/7450
72
The End
CS 4460/7450
73
FIELD
• Program development and analysis
environment with a wide assortment of
different program views
 Integrated a variety of UNIX tools
 Utilized central message server architecture in
which tools communicated through message
passing
Reiss
Software Pract & Exp ‘90
CS 4460/7450
74
FIELD
Interface
CS 4460/7450
75
Dynamic Call Graph View
On call
stack
CS 4460/7450
Currently
active
76
Class Browser
CS 4460/7450
77
Heap View
Color can be
When allocated
Block size
Where allocated
CS 4460/7450
78
3D Call Graph
Selected file
Groups of calls
Collapsed file
CS 4460/7450
79
Visualizing Application Behavior
on Superscalar Processors
Chris Stolte, Robert Bosch, Pat
Hanrahan, and Mendel Rosenblum
Proc. InfoVis 1999
Superscalar Processors: Quick
Overview
• Pipeline
• Multiple Functional Units
 Instruction-Level Parallelism (ILP)
• Instruction Reordering
• Branch Prediction and Speculation
• Reorder Buffer
 Instructions wait to graduate (exit pipeline)
CS 4460/7450
81
CS 4460/7450
82
CS 4460/7450
83
CS 4460/7450
84
CS 4460/7450
85
CS 4460/7450
86
CS 4460/7450
87
CS 4460/7450
88
CS 4460/7450
89
Critique
• Most code doesn’t need this level of
optimization, but
 The visualization is effective, and would be
useful for code that does
 May reduce the expertise needed to perform
low level optimzation
• Might be effective as a teaching tool
• Bad color scheme: black/purple/brown
• Does it scale with processor complexity?
CS 4460/7450
90
Concurrent Programs
• Understanding parallel programs is even
more difficult than serial
• Visualization and animation seem naturals
for illustrating concurrency
• Temporal mapping of program execution
to animation becomes critical
Kraemer & Stasko
JPDC ‘93
CS 4460/7450
91
POLKA
• We developed a system, POLKA, that was
designed to help people build
visualizations of concurrent programs
 Used for both program and algorithm
visualization
 Used for different programming models:
message passing, shared memory threads, ...
Stasko & Kraemer
JPDC ‘93
CS 4460/7450
92
Message Passing Systems
PVM/Conch
Topol, Stasko, Sunderam
IJPDSN ‘98
CS 4460/7450
93
Shared Memory Threads
CS 4460/7450
Pthreads
Zhao & Stasko TR ‘95
94
StrataVarious: MultiLayer Visualization of
Dynamics in Software System Behavior
Doug Kimelman, Bryan Rosenburg, Tova Roth
Proc. Fifth IEEE Conf. Visualization ’94, IEEE
Computer Society Press, Los Alamitos, Calif.,
1994, pp. 172–178.
StrataVarious
• Trace-driven program visualization
• Trace: sequence of <time, event> pairs
• Events captured from all layers (strata,
hence the name):
 Hardware
 Operating System
 Application
• Replay execution history
• Coordinate navigation of event views
CS 4460/7450
96
StrataVarious: Why Multi-Strata?
• Debugging and tuning requires
simultaneously analyzing behaviour at
multiple layers of the system
CS 4460/7450
97
Accumulated
process times
Call tree
Kernel
performance
stats
CS 4460/7450
Process
run history
98
CS 4460/7450
99
Goals
• Facilitate debugging by providing
execution time information
• Faster assimilation of execution data
• Facilitate visual correlations, trends and
anomalies
• Several layers including user-level
libraries, OS and hardware
CS 4460/7450
100
PV
• Trace-driven
• Trace: Time ordered sequence of events of
interest
• Synchronous and asynchronous
configurations
• Multiple concurrent views representing the
different layers
CS 4460/7450
101
Views
• Process scheduling and System activity
 Scheduling view
 Activity view
CS 4460/7450
102
Views
• Memory Activity and Application Progress
 Memory size view
 Source of memory allocation
 State of the page of the data segment
CS 4460/7450
103
Memory view
Code
views
CS 4460/7450
104
Views
• Hardware Activity and Source Progress
 Active loop view
 Hardware performance statistics
CS 4460/7450
105
Conclusion
• A more effective and comprehensive
visualization of software
• Worthwhile for developers facing
performance issues
• Room for improvement in visualization
• More direct traversal
• Presentation of derived data
CS 4460/7450
106
StrataVarious: Critique
• Examples demonstrate usefulness
• Fundamentally, a good idea
 Increasing importance as multi-core machines
become standard
• Many windows
 Titles not meaningful
• Dubious claim that tracing does not alter
behavior
CS 4460/7450
107
What is Useful? 1991 Survey
• Very useful
14,1,18,32,2,
19,5
• Useless
 22,33,34
• All rest
useful but
not essential
Software Visualization Tools: Survey and Analysis
Sarita Bassil and Rudolf K. Keller, 1991
CS 4460/7450
108
What is Useful?
• Very useful
14,1,18,32,2,
19,5
• Useless
 22,33,34
• All rest
useful but
not essential
CS 4460/7450
109
Download