Visualization Considerations for Interactive Epoch-Era Analysis

by
Varsha J. Raghavan
S.B., Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 2014
Submitted to the Department of Electrical Engineering and Computer Science
in Partial Fulfillment of the Requirements for the Degree of
Master of Engineering in Electrical Engineering and Computer Science
at the Massachusetts Institute of Technology
June 2015
©2015 Massachusetts Institute of Technology
All rights reserved.
Author:
Department of Electrical Engineering and Computer Science
May 22, 2015
Certified by:
Dr. Adam M. Ross
Research Scientist, Engineering Systems
Lead Research Scientist, Systems Engineering Advancement Research Initiative
Thesis Supervisor
May 22, 2015
Certified by:
Dr. Donna H. Rhodes
Principal Research Scientist and Senior Lecturer, Engineering Systems
Director, Systems Engineering Advancement Research Initiative
Thesis Co-Supervisor
May 22, 2015
Accepted by:
Prof. Dennis M. Freeman
Chairman, Masters of Engineering Thesis Committee
Abstract
Epoch-Era Analysis (EEA) is a quantitative analysis approach that allows decision-makers to
evaluate performance of design alternatives over a set of possible futures. To address the
computational and cognitive burden arising from a potentially unlimited number of futures, the
insertion of human-in-the-loop interaction into certain modules of EEA, or Interactive Epoch-Era
Analysis (IEEA), was proposed. This thesis discusses user goals of these modules as well as
principles of data visualization and user interface design, and evaluates the functionality and
usability of selected visualizations and interfaces accordingly in the context of these IEEA
modules.
Thesis Supervisor: Adam M. Ross
Title: Research Scientist, Engineering Systems, Systems Engineering Advancement Research
Initiative
Acknowledgments
The author wishes to thank a number of individuals without whom this work would not be
possible: Most importantly, to Dr. Adam Ross and Dr. Donna Rhodes, thank you both so much
for allowing me the opportunity to join SEAri and for giving me the necessary guidance, tools,
support, and prodding to accomplish all I did. Thanks also to all SEAri students who lent me
their time and advice over my year here, especially Mike, Matt, and Paul for giving me all the
answers I asked (and didn’t ask) for. Thank you to all my friends and classmates here (especially
Rachael!) that have pretty much kept me sane and motivated to keep going throughout MIT.
Lastly, to my mom and dad, thank you for all of your endless love and encouragement, and for
never giving up on me.
Table of Contents
Visualization Considerations for Interactive Epoch-Era Analysis
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
  1.1 Background
  1.2 Thesis Overview
Chapter 2: Functionality Criteria
  2.1 Sampling Module Activities
    2.1.1 Epoch Sampling Submodule
    2.1.2 Era Sampling Submodule
  2.2 Analysis Module Activities
    2.2.1 Single-Epoch Analysis Submodule
    2.2.2 Multi-Epoch Analysis Submodule
    2.2.3 Single-Era Analysis Submodule
    2.2.4 Multi-Era Analysis Submodule
Chapter 3: Usability Criteria
  3.1 Overall Design Considerations
  3.2 Usability
    3.2.1 Learnability
    3.2.2 Efficiency
    3.2.3 Error-Tolerance
Chapter 4: Visual Analytics
  4.1 Geometric Visualizations
    4.1.1 Scatterplots
    4.1.2 Line Graphs
    4.1.3 Parallel Coordinate Plots
    4.1.4 Force Diagrams (Interactive)
    4.1.5 Sankey Diagrams
  4.2 Pixel-Based Visualizations
    4.2.1 Pixel Bar Charts
    4.2.2 Color Mapping (Heatmaps)
  4.3 Icon-Based Visualizations
    4.3.1 Star Plots
    4.3.2 Chernoff Faces
    4.3.3 Stick Figures
    4.3.4 Color/Shape Icons
  4.4 Hierarchy-Based Visualizations
    4.4.1 Hierarchical Axes
    4.4.2 Trees
    4.4.3 Treemaps
    4.4.4 Circle Packing
  4.5 Data Interaction Techniques
    4.5.1 Drag and Drop
    4.5.2 Selection
Chapter 5: Functionality and Usability Examination of IEEA Epoch Sampling Submodule
  5.1 Functionality
    5.1.1 Scatterplots/Bubble Charts
    5.1.2 Parallel Coordinate Plots
    5.1.3 Trees
    5.1.4 Treemaps and Circle Packing
    5.1.5 Evaluation Summary
  5.2 Implementation
  5.3 Usability
    5.3.1 Learnability
    5.3.2 Efficiency
    5.3.3 Error-Tolerance
Chapter 6: Functionality Examination for Other IEEA Submodules
  6.1 Era Sampling
    6.1.1 Parallel Coordinates
    6.1.2 Sankey Diagrams
    6.1.3 Tree Structures
    6.1.4 Bar Chart Icons
    6.1.5 Drag and Drop
    6.1.6 Evaluation Summary
  6.2 Multi-Epoch Analysis
    6.2.1 Scatterplot Variants
    6.2.2 Epochs as Parallel Coordinates
    6.2.3 Circular Extensions
    6.2.4 Evaluation Summary
  6.3 Single-Era Analysis
    6.3.1 Designs as Trees
    6.3.2 Line Graphs
    6.3.3 Parallel Coordinates
    6.3.4 Scatterplot Matrices
    6.3.5 Evaluation Summary
  6.4 Multi-Era Analysis
    6.4.1 Line Graph Matrix
    6.4.2 Line Graphs to Represent One Design
    6.4.3 Sankey Diagrams
    6.4.4 Evaluation Summary
Chapter 7: Discussions and Conclusion
Bibliography
Appendix
List of Figures
Figure 1-1: Activities involved in Epoch-Era Analysis (taken from Curry 2015)
Figure 2-1: Illustration of the concept of Fuzzy Pareto Optimality, where K is the level of “fuzziness” applied to the Pareto front (left) to create the Fuzzy Pareto Front (shaded area, right). Graphic taken from (Schaffner et al., 2014)
Figure 4-0: The visual analytics process, taken from (Keim, et al. 2010).
Figure 4-1: Example of scatterplot
Figure 4-2: Example of bubble chart
Figure 4-3: Scatterplot Matrix of a 6-dimensional car dataset, with variables plotted pairwise, from (Hoffman, 1999)
Figure 4-4: Scatterplot Matrix with histograms plotted along diagonal, from (Grinstein, 2001)
Figure 4-5: Line graph with multiple lines, from (Wallace 2004)
Figure 4-6: Example of Parallel Coordinate Plot, from Wikipedia
Figure 4-7: Polar chart showing Iris Flower dataset (left) and RadViz showing example car dataset (right). Both images taken from (Hoffman 1999)
Figure 4-8: Force-Directed Graph depicting character co-occurrence in “Les Miserables,” from (Bostock, 2012)
Figure 4-9: Sankey diagram showing a possible scenario for UK energy production and consumption in 2050, with supply on the left and demands on the right, from (Bostock, 2012)
Figure 4-10: Equal-height pixel bar chart with color encoding different attributes, from (Chan, 2006)
Figure 4-11: Heatmaps encoding data in every pixel. Random data set encoded into a 10x10 pixel square (left) from (Grinstein, 2001), and local thermal power data encoded into a map of the whole US (right) from (ICM Consulting 2015)
Figure 4-12: 36 twelve-dimensional data points represented as star plots, and organized by “weight” (bottom variable), from (Friendly, 1991)
Figure 4-13: Different Chernoff facial features (left) and Chernoff faces plotted in various 2D positions on a scatterplot (right), taken from (Chan, 2006)
Figure 4-14: A family of 12 stick figures (left) and a scatterplot of stick figures (right), taken from (Liu 2014)
Figure 4-15: Splitting scheme of hierarchical axes (left) next to the final histograms-within-histograms matrix visualization (right), from (Chan, 2006)
Figure 4-16: Example unlabeled tree visualization, from (BigML Blog 2012)
Figure 4-17: Example treemap of country population by continent, from (Veroy 2013)
Figure 4-18: Example circle packing layout from Mike Bostock’s website
Figure 4-19: An example resizable object, denoted by the grooved “grippable” corner
Figure 4-20: An example of checkboxes vs. radio buttons, from (Lepofsky 2015)
Figure 4-21: An example of toggle switches, from XOO.me design directory
Figure 4-22: Examples of filters for different types of variables: Size, Designer, and Color all allow discrete selection of respective values (numerical and categorical), and the slider (bottom right) allows selection for the continuous variable Price (as shown here, the selection allows values from $0-$750). All taken from an actual retail website, Rent the Runway
Figure 4-23: An example of data brushing, taken from Mike Bostock’s website. The data was selected in the top-left box, and is colored the same across all scatterplots in the matrix.
Figure 5-1: Example of IEEA Epoch Sampling implemented as a scatterplot. The epoch variables were “Tech Level,” with values “future” or “present,” and “User Preference,” with values 18
Figure 5-2: Example of IEEA Epoch Sampling sketched as a Parallel Coordinate Plot
Figure 5-3: Example of IEEA Epoch Sampling on NGCS data implemented as a Tree
Figure 5-4: Treemap visualization of NGCS epochs
Figure 5-5: Circle packing visualization of NGCS epochs as seen at different zoom levels
Figure 5-6: Start state of Epoch Sampling interface
Figure 5-7: IDs of selected epochs (top) along with current state of top-right box of interface displaying fraction of epochspace selected (12 epochs out of 108 total; bottom)
Figure 5-8: Partially expanded tree (using NGCS database epoch variables/values) in implemented interface
Figure 5-9: The bottom box of the Epoch Sampling interface in “SELECT mode”
Figure 6-1: Automatically enumerated era set represented as a Sankey diagram of epoch flows.
Figure 6-2: A single era represented as a series of epochs along a single axis
Figure 6-3: An unorganized set of 7 eras (represented by bar chart icons) with hue, color, and orientation as additional encoding.
Figure 6-4: Sketch of leveraging drag-and-drop and resizing functionalities for manual era creation
Figure 6-5: Bubble chart paired with parallel coordinates that allow user to choose which attributes to plot (taken from Rhodes and Ross, 2015)
Figure 6-6: Parallel coordinate plot showing Fuzzy Pareto Number of 7 designs (horizontal lines) being plotted over 6 epochs (vertical axes)
Figure 6-7: Parallel sets (Sankey) visualization of designs following a changeability strategy from frame to frame (i.e. every transition), from (Schaffner 2014). Color coded by start design, horizontal line size “reflects the proportion of clips in which the corresponding design number appears in that frame”
Figure 6-8: Two line graphs showing MAU (left) and MAE (right) of 6 designs over the course of a 4-epoch era (Epoch 1: 3 yrs, Epoch 2: 3 yrs, Epoch 3: 2 yrs, Epoch 4: 2 yrs), from (Schaffner 2014)
Figure 6-9: Two layouts of a scatterplot matrix showing a four-epoch era
Figure 6-10: A selection of four eras displayed in a line graph matrix
Figure 6-11: Line graph showing trajectories (in terms of MAU) for one design (and subsequent changes/options) across 3 eras
List of Tables
Table 4-1: Summary of visualizations presented in this chapter (Sections 4.1-4.4). Includes visualization names, brief notes about their major strengths/capabilities, the number of dimensions supported (either a number or number range, multidimensional [meaning 2+], or in the case of Force Diagrams, not applicable), the variable types supported (Discrete, Continuous, or Any), the dataset size supported (Small, Med, Large, or Any), and the types of interactions supported from those presented in Section 4.5.
Table 5-1: Summary of characteristics for each visualization (Sec. 5.1.1-5.1.4), with best alternative for each row underlined. “Fine” indicates that a visualization is passable – not helpful, but not unhelpful.
Table 6-1: Summary of characteristics for each visualization (Sec. 6.1.1-6.1.4), with best alternative for each row underlined. “Fine” indicates that a visualization is passable – not helpful, but not unhelpful.
Table 6-2: Summary of characteristics for each visualization (Sec. 6.2.1-6.2.3)
Table 6-3: Summary of characteristics for each visualization (Sec. 6.3.2-6.3.4)
Table 6-4: Summary of characteristics for each visualization (Sec. 6.4.1-6.4.3)
Chapter 1: Introduction
Systems engineering, according to the International Council on Systems Engineering (INCOSE),
is “an interdisciplinary approach and means to enable the realization of successful systems.”1
Such systems can range from spacecraft to computer chips, and can affect anywhere from
hundreds to millions of people, locally or globally. The field encompasses the design, operation,
and management of these systems, focusing on customer needs and required functionality before
proceeding with manufacture. In order to start designing these systems, a number of decisions
must be made about the specifications of the system.
There are several methods for analyzing and choosing from various available design alternatives,
and one or more could be utilized depending on the type of problem, time and resources
available, end goal, etc. Regardless of the method(s) used, it is advantageous for the decision-maker to leverage computer software to perform the analysis, as machines are able to process
and display information much faster than humans can manually. However, for a computer
program to be useful, it must not only have the ability to correctly perform the tasks the user
requires (functionality), but also the user must be able to understand how to use it in order to
optimize satisfaction and results (usability). This thesis will draw principles of data visualization
and user interface design from the field of computer science to address these two properties in
depth and examine them in the context of Epoch-Era Analysis, a systems engineering framework
for decision-making across uncertain futures, described in the next section.
1.1 Background
Traditional systems engineering analysis approaches develop system specifications under the
assumption of relatively static environments and stakeholder needs. Cost-benefit analysis, for
example, in which a decision-maker assigns a measure of “utility” and “cost” to each design
alternative and selects designs in the so-called “tradespace” that maximize utility while
minimizing cost (designs on, or close to, the Pareto front), allows system designers to optimize
the system as defined in the current context.
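As a rough illustration of this selection step, the following sketch identifies the Pareto front of a small, invented tradespace. The `Design` type, the design names, and the cost and utility values are all hypothetical placeholders rather than data from any case study; the only assumption is that lower cost and higher utility are preferred.

```python
from typing import List, NamedTuple


class Design(NamedTuple):
    """A design alternative scored on cost and utility (values are invented)."""
    name: str
    cost: float
    utility: float


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on both measures and strictly better on one."""
    no_worse = a.cost <= b.cost and a.utility >= b.utility
    strictly_better = a.cost < b.cost or a.utility > b.utility
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


designs = [
    Design("A", cost=1.0, utility=0.2),
    Design("B", cost=2.0, utility=0.6),
    Design("C", cost=2.5, utility=0.5),  # dominated by B: costs more, delivers less
    Design("D", cost=4.0, utility=0.9),
]
front = pareto_front(designs)
print([d.name for d in front])
```

This pairwise comparison is quadratic in the number of designs, which is adequate for a sketch but may need a sort-based approach for large tradespaces.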
Unfortunately, this approach does not account for the changes in environment and stakeholder
needs that almost inevitably will occur over the entire life cycle of a system. Clients will likely
change their minds about what they want (i.e. needs) or where they want to use it (i.e. context),
resulting in “change requests” that systems engineers operating under the aforementioned
assumption will likely not have accounted for in their previously developed specifications. This
illustrates that analyses performed under the assumption of static context and needs do not ensure
that a system will continually deliver value and meet stakeholder expectations in the face of
changing contexts and needs throughout its life cycle. To address this problem, Ross and
Rhodes, of MIT’s Systems Engineering Advancement Research Initiative (SEAri), introduced
Epoch-Era Analysis (EEA), an analysis approach that allows decision-makers to evaluate
performance of design alternatives over a set of possible futures (Ross 2006; Ross and Rhodes
2008)2,3.
1
"What Is Systems Engineering?" INCOSE. International Council on Systems Engineering, 14 June 2004.
<http://www.incose.org/practice/whatissystemseng.aspx>.
The fundamental unit of EEA is the epoch, a period of time with fixed context (e.g. political,
economic) and needs (of any stakeholders). Each epoch can be described with a combination of
epoch variables, which represent important uncertainty factors in contexts and needs that could
potentially affect system performance. For example, a case study commonly used (seen in
Fitzgerald et al. 2012; Fulcoly et al. 2012)4,5,6,7 involves the design of a space tug, with two
defined epoch variables: Technology Level and User Preference. There are two values of
Technology Level (Present and Future contexts) and eight different user preferences (or sets of
stakeholder needs), making for sixteen total epochs in this study.
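The epoch count in the space tug example follows from a full enumeration of the epoch variable values. A minimal sketch of that enumeration is given below; the dictionary layout, the function name `enumerate_epochs`, and the integer labels 1 through 8 for the user preferences are assumptions made for illustration only.

```python
from itertools import product
from typing import Dict, List

# Epoch variables and their possible values. The variable names come from the
# space tug example; the labels 1..8 for user preferences are placeholders.
epoch_variables: Dict[str, list] = {
    "Technology Level": ["Present", "Future"],
    "User Preference": list(range(1, 9)),
}


def enumerate_epochs(variables: Dict[str, list]) -> List[dict]:
    """Return one epoch per combination of epoch variable values."""
    names = list(variables)
    value_lists = (variables[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


epochs = enumerate_epochs(epoch_variables)
print(len(epochs))  # 2 technology levels x 8 user preferences = 16 epochs
```

Because the number of epochs is the product of the number of values of each epoch variable, adding variables or values increases the size of the epoch space multiplicatively.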
An ordered string of epochs, each with a defined duration, is called an era. Once a set of epochs
or eras is generated, users may compare and evaluate designs in different epochs or eras. Figure
1-1 shows the activities involved in EEA: Once users have identified decisions to be made and
all relevant epoch and design variables (“Problem Definition” and “Design Formulation”), they
can create epochs through enumeration and selection of epoch variables (“Epoch
Characterization”). If users are planning on performing era analysis, they must also construct
eras by selecting and ordering epochs (“Era Construction”). After developing models through
which to evaluate designs in each epoch (e.g. a measure of utility or expense, “Design-Epoch-Era Evaluations”), users can perform Single- or Multi-Epoch or Era Analysis to better
understand how the designs they selected will fare over the set of possible futures they selected.
As each of these analyses has slightly different requirements and goals (discussed more in the
next chapter), it is important that they each be represented and handled accordingly.
2
Ross, A.M., “Managing Unarticulated Value: Changeability in Multi-Attribute Tradespace Exploration,” PhD
thesis, MIT Engineering Systems Division, June 2006.
3
Ross, A.M., and Rhodes, D.H., "Using Natural Value-centric Time Scales for Conceptualizing System Timelines
through Epoch-Era Analysis," INCOSE International Symposium 2008, Utrecht, the Netherlands, June 2008
4
Fitzgerald, M.E., Ross, A.M., and Rhodes, D.H., "Assessing Uncertain Benefits: a Valuation Approach for
Strategic Changeability (VASC)," INCOSE International Symposium 2012, Rome, Italy, July 2012.
5
Fitzgerald, M.E. and Ross, A.M., "Mitigating Contextual Uncertainties with Valuable Changeability Analysis in
the Multi-Epoch Domain," 6th Annual IEEE Systems Conference, Vancouver, Canada, March 2012
6
Fitzgerald, M.E. and Ross, A.M., "Sustaining Lifecycle Value: Valuable Changeability Analysis with Era
Simulation," 6th Annual IEEE Systems Conference, Vancouver, Canada, March 2012.
7
Fulcoly, D.O., Ross, A.M., and Rhodes, D.H., "Evaluating System Change Options and Timing using the Epoch
Syncopation Framework," 10th Conference on Systems Engineering Research, St. Louis, MO, March 2012.
Figure 1-1: Activities involved in Epoch-Era Analysis (taken from Curry 2015)
Epoch-Era Analysis was originally introduced to provide an extension to single-context
tradespace exploration. Recently, many case studies using parts, or all, of Epoch-Era Analysis
have been conducted on different datasets for different purposes, including (Fulcoly 2012)8,
(Pina 2014)9, (Rader 2014)10, and (Schaffner 2014)11, where EEA has not only been
demonstrated to help choose best designs over a whole system lifecycle, but also to gain insights
about different characteristics of the tradespace and effects of different design attributes and
futures.
Schaffner (2014) shows that the number of possible epochs and eras generated by enumerating
epoch variables can quickly exceed a feasible number for users to explore: If an epoch can be
described with $V$ epoch variables, each $V_i$ of which has $L_i$ levels, the number of total epochs a
system can experience, $N_{Epochs}$, can be described as:
$$N_{Epochs} = \prod_{i=1}^{V} L_i$$
To then construct an era, a number of these $N_{Epochs}$ epochs must be selected, ordered, and
assigned durations. Even if we simplify the process by allowing only a selection of $n$ epochs and
assuming that all epochs have the same duration, the total number of possible eras a system can
experience (assuming any epoch can transition to any other epoch), $N_{Eras}$, is at most:
$$N_{Eras} = \left(N_{Epochs}\right)^{n} = \left(\prod_{i=1}^{V} L_i\right)^{n}$$
It follows that the size of the era space is necessarily greater than or equal to the size of the
epoch space. Schaffner’s example of a model of 5 epoch variables with 3 levels each results, by these
calculations, in $N_{Epochs} = 3^5 = 243$ possible epochs and, assuming $n$ is in the range of 15 to 20,
a maximum of between $N_{Eras} \approx 6 \times 10^{35}$ and $N_{Eras} \approx 5 \times 10^{47}$ possible eras on which to perform
8
Fulcoly, D.O., Ross, A.M., and Rhodes, D.H., "Evaluating System Change Options and Timing using the Epoch
Syncopation Framework," 10th Conference on Systems Engineering Research, St. Louis, MO, March 2012.
9
Pina, A.L. “Applying Epoch-Era Analysis for Homeowner Selection of Distributed Generation Power Systems,”
Master of Science Thesis, Engineering and Management, Massachusetts Institute of Technology, June 2014.
10
Rader, A.A., Ross, A.M., and Fitzgerald, M.E., "Multi-Epoch Analysis of a Satellite Constellation to Identify
Value Robust Deployment across Uncertain Futures," AIAA Space 2014, San Diego, CA, August 2014.
11
Schaffner, M.A., “Designing Systems for Many Possible Futures: The RSC-based Method for Affordable Concept
Selection (RMACS), with Multi-Era Analysis,” Master of Science Thesis, Aeronautics and Astronautics,
Massachusetts Institute of Technology, June 2014.
analyses. If the simplifying assumptions are removed, this maximum number of possible eras can
grow potentially boundlessly.
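The counts quoted for Schaffner’s example can be checked with a few lines of arithmetic. This is a minimal sketch under the same simplifying assumptions stated above (equal durations, exactly n epochs per era, any epoch may follow any other); the function names are illustrative only.

```python
from math import prod

def count_epochs(levels):
    """Number of epochs from a full factorial of epoch variable levels."""
    return prod(levels)

def max_eras(levels, n):
    """Upper bound on eras built from n ordered epochs of equal duration."""
    return count_epochs(levels) ** n

levels = [3] * 5                      # 5 epoch variables, 3 levels each
print(count_epochs(levels))           # 243
print(f"{max_eras(levels, 15):.1e}")  # on the order of 6 x 10^35
print(f"{max_eras(levels, 20):.1e}")  # on the order of 5 x 10^47
```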
The availability of so many epochs and eras could potentially result in biased or uninformative
analysis. To address this, Curry et al. have proposed a framework for Interactive Epoch-Era
Analysis (IEEA), in which certain EEA activities are performed with human feedback, as seen in
Figure 1-2. Curry and Ross hypothesized that this interactivity would enable improved decision-making intuition and insight, as well as “intelligently limit the potential unbounded growth in the
epoch/era space” (Curry 2015)12.
Figure 1-2: A framework for Interactive Epoch-Era Analysis, showing five “modules” with human
feedback (taken from Rhodes and Ross 2015)13
This framework can more easily be abstracted into six main modules:
• Elicitation of relevant epoch and design variables (often through interview),
• Generation of all epochs and design tradespaces (often including enumeration),
• Sampling of epochs and eras in which to evaluate design choices,
• Evaluation of designs in sampled subset of epochs and eras,
• Analyses of design choices in the previously evaluated epochs and eras, and finally
• Decisions of final designs based on iterative evidence from previous modules.
12
Curry, M.D. and Ross, A.M., "Considerations for an Extended Framework for Interactive Epoch-Era Analysis,"
CSER 2015.
13
Rhodes D.H. and Ross A.M., Interactive Model-Centric Systems Engineering (IMCSE) Phase Two Technical
Report SERC-2015-TR-048-2; February 2015.
While the sequence of these modules flows logically, IEEA is intended to be an iterative process
where users can go back and change responses within earlier modules at any point to reflect what
they have learned from later ones.
Elicitation and generation have been primarily human tasks, with some structured support via
static documentation; sampling, however, is the first module in the framework that can clearly
benefit from human-computer interaction and feedback. In this module, the human must make
sense of the generated epoch and era spaces and decide on which subset of epochs and eras to
spend scarce computational and human-attention resources. This module can be thought of as
encompassing two submodules of IEEA: Epoch Sampling and Era Sampling. Visualization and
feedback are key to letting the user interact with the data representing possible epoch and era
subset samples drawn from the larger generated epoch and era spaces. Evaluation is again
primarily a human task, as it requires judgment more than computational power, but the
subsequent Analyses module (encompassing the submodules of Single-Epoch, Multi-Epoch,
Single-Era, and Multi-Era Analyses) is the opposite, requiring as much interaction and
human-computer feedback as necessary for a user to fully explore all of the available design
options before finally making a decision.
1.2 Thesis Overview
The work presented in this thesis aims to contribute to an overall research effort to demonstrate
that adding interactivity to interfaces increases user satisfaction, through elevated functionality
and usability. The overview of this thesis is now described. Chapter 2 will propose criteria for
evaluating designs with respect to functionality, while Chapter 3 will present usability criteria,
including overall practices of “good” graphic design. Chapter 4 will present an overview of the
field of visual analytics, including a survey of existing visualization and data manipulation
techniques. Chapter 5 will evaluate the functionality and usability of potential visualization
interfaces for the introduced Epoch Sampling submodule, and Chapter 6 will go on to evaluate
the functionality of potential visualizations for the Era Sampling submodule as well as the IEEA
Multi-Epoch, Single-Era, and Multi-Era Analysis submodules. Note that Single-Epoch Analysis
will not be considered in this thesis, as this submodule by itself bears no fundamental difference
from the aforementioned traditional analysis under static contexts/needs assumptions. Finally,
Chapter 7 will provide concluding thoughts about the research contributions of this thesis as well
as considerations for future work.
Chapter 2: Functionality Criteria
As mentioned above, this thesis will evaluate the functionality and usability of potential
interfaces for five interactive submodules in IEEA: Epoch Sampling, Era Sampling, and Multi-Epoch, Single-Era, and Multi-Era Analysis. Before we examine usability criteria, in this
chapter we first discuss these submodules a bit further and propose a set of user-centric criteria
for each submodule, adapted from Curry et al.’s hypotheses regarding IEEA (Curry 2015), with
which to evaluate the functionality of its proposed interfaces. These criteria were then presented
to three current SEAri graduate students who identified as novice to expert Epoch-Era Analysis
practitioners, along with misleading criteria and opportunity for write-in criteria, and the
strongest responses for each submodule have been included here. It is important to note a
distinction before proceeding: this thesis does not aim to develop or evaluate functionality
related to analysis models, strategies/procedures, or the modules or submodules of IEEA
themselves; rather, it aims to introduce considerations for visualization interfaces for these
pre-developed submodules and to evaluate the functionality of those interfaces alone.
2.1 Sampling Module Activities
By the time users have gotten to the Sampling module in IEEA, the assumption is that they will
already have brainstormed all relevant epoch variables, as well as enumerated all of the possible
epochs by taking a full factorial of all variable combinations. It should be noted that in practice,
data for every one of these epochs will not necessarily be generated already, so in many cases,
only a fraction of this enumeration is readily available for analysis.
2.1.1 Epoch Sampling Submodule
The end result of Epoch Sampling is for users to have selected a number of epochs with which to
construct eras and/or conduct Single- or Multi-Epoch Analysis. In order to get anything useful
later from the analysis module, users must start to have a thorough understanding of the concept
of epochs and what kinds of impacts they can have on later analysis, so it is very important to
pick user goals that stress this understanding:
• The user should understand how each of the epochs are defined in the dataset (e.g. epoch
variables and values; what is a context and what is a need, etc.).
• Based on this, the user should be able to find and select epochs that he deems important
on which to conduct further analysis.
• The user should understand a) the size of the epoch space, b) what fraction is available to
explore (for which epochs data has already been generated), c) what fraction of this has
already been explored or selected to explore, helping to “intelligently limit the potentially
unbounded growth in the epoch/era space.”
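The three quantities in the last goal above (size of the epoch space, fraction available, fraction explored) are simple ratios that an interface could compute and display. The sketch below is purely illustrative: the function name, the use of integer epoch identifiers, and the example numbers are assumptions, not part of any existing IEEA tool.

```python
def coverage(total_epochs, available, explored):
    """Return (fraction of epoch space with data, fraction of that explored)."""
    available = set(available)
    # Only epochs with generated data can count as explored.
    explored = set(explored) & available
    frac_available = len(available) / total_epochs
    frac_explored = len(explored) / len(available) if available else 0.0
    return frac_available, frac_explored

# Hypothetical example: 16 enumerated epochs, data generated for 8 of them,
# and 2 of those already selected by the user.
frac_available, frac_explored = coverage(16, range(8), [0, 3])
print(frac_available, frac_explored)  # 0.5 0.25
```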
2.1.2 Era Sampling Submodule
Similar to Epoch Sampling, the end result of Era Sampling is for users to have selected a number
of eras with which to conduct Single- or Multi-Era Analysis while starting to understand the
concept and potential impact of eras on these later analyses. As shown in the previous chapter,
the size of the era space is necessarily at least as large as that of the epoch space; in fact, there is
always an infinite number of possible eras that can be enumerated from epochs, as each era is
constructed by stringing together multiple epochs, each with a potentially infinite number of
durations to choose from. Thus it would be futile for a user to try to visualize the size of the era
space, or even to be confident that the fraction he explores is representative of the possible futures.
There are two main categories of era construction methods: Computational (automatic) and
Narrative (manual). Fitzgerald and Ross describe a simple era simulator that exemplifies the
former category: it automatically “constructs a stochastic sequence of epochs over which
designs will be valued” by randomly selecting epochs in succession14. The alternative to this type
of automatic era construction is the less computationally taxing method of manually creating
eras through narrative, or writing story-like scenarios to explain and dictate the change in
contexts and needs over time, as performed in (Schaffner, 2014; Pina, 2014). Roberts, et al. point
out that since fewer eras can be created and analyzed through narrative-based approaches due to
their time-intensiveness, these generally consider extreme scenarios15. For both of these methods,
the goal of understanding the size of the era space (analogous to part (a) of the third goal in
Epoch Sampling above) is removed since it is always infinite.
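The computational category of era construction described above can be sketched as follows. This is not the simulator of Fitzgerald and Ross; it is a minimal illustration that assumes uniform random selection, a fixed number of epochs per era, and one shared duration, all of which are simplifications introduced here.

```python
import random

def simulate_era(epoch_ids, n_epochs, duration, seed=None):
    """Build one era as an ordered list of (epoch_id, duration) pairs.

    Assumptions for illustration: each epoch is drawn uniformly at random,
    any epoch may follow any other, and every epoch has the same duration.
    """
    rng = random.Random(seed)
    return [(rng.choice(epoch_ids), duration) for _ in range(n_epochs)]

era = simulate_era(list(range(16)), n_epochs=5, duration=2.0, seed=0)
era_length = sum(d for _, d in era)
print(era)
print(era_length)  # 10.0
```

Fixing the seed makes a generated era reproducible, which matters if a user later wants to revisit an era he sampled earlier.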
After constructing eras through either method (as part of the IEEA Generation module), a user
may still want to sample eras from this subset of all possible eras, thus leading him to perform
Era Sampling as part of the IEEA Sampling module. This will be most common in the case of
computationally generated eras, as users will tend to only manually construct eras they intend on
analyzing in the first place. Thus, this section will list goals for era sampling from
computationally generated eras, as well as for “manual sampling,” or manually deciding on and
constructing eras.
• Sampling from computationally-generated eras:
a) The user should understand how eras are defined and represented in the
interface (e.g. epochs and durations)
b) Based on this, the user should be able to find and select important eras on
which to conduct further analysis.
14
Fitzgerald, M.E. and Ross, A.M., "Sustaining Lifecycle Value: Valuable Changeability Analysis with Era
Simulation," 6th Annual IEEE Systems Conference, Vancouver, Canada, March 2012.
15
Roberts, C.J., Richards, M.G., Ross, A.M., Rhodes, D.H., and Hastings, D.E., "Scenario Planning in Dynamic
Multi-Attribute Tradespace Exploration," 3rd Annual IEEE Systems Conference, Vancouver, Canada, March 2009.
c) The user should understand i) how much of the era space is available to
explore (for which eras data has already been computationally generated), ii)
what fraction of this has already been explored or selected to explore.
• Sampling through manual era generation:
a) The user should be able to create an era by choosing epochs and setting their
durations.
b) Based on this, the user should be able to find and select important eras on
which to conduct further analysis after they have been created.
c) The user should understand i) how much of the era space is available to
explore (how many eras have been generated thus far), ii) what fraction of this
has already been explored or selected to explore.
2.2 Analysis Module Activities
Once the user has selected epochs to analyze, he enters the Evaluation module, where he must
generate models to evaluate designs with respect to the current context and stakeholder needs.
The resulting metrics may be as simple as the level of a positive design attribute and initial cost
of manufacture, or a more involved aggregation of models such as Multi-Attribute Utility
(introduced by Keeney and Raiffa, 1976)16 and Multi-Attribute Expense (introduced by Diller,
2002)17. The Evaluation module relies on human judgment to pick appropriate evaluation
metrics, and possibly further computation to score alternatives with these metrics, which will
then enable the user to enter the Analyses module, where Single- and/or Multi- Epoch and/or Era
Analysis can be conducted. As mentioned in the thesis overview, Single-Epoch Analysis will not
be discussed in depth in this thesis, though we will present a section on this submodule here just
for comparison and completeness.
2.2.1 Single-Epoch Analysis Submodule
Single-Epoch Analysis allows evaluation of multiple designs in one epoch (a single combination
of contexts and stakeholder needs) at a time, based on any metric of the user’s choosing. The
approach used in this submodule is analogous to the aforementioned traditional tradespace
exploration based on design and epoch attributes, and thus will similarly be limited in its ability to
inform about a design’s performance over the entire system lifecycle. Criteria for evaluating this
submodule would be as follows:
• The user should be able to evaluate and compare designs’ performance in a single epoch
by user’s choice of evaluation metric.
16
Keeney, R. L., & Raiffa, H. (1976). Decisions with Multiple Objectives. Wiley, New York.
17
Diller, N. P. “Utilizing Multiple Attribute Tradespace Exploration with Concurrent Design for Creating
Aerospace Systems Requirements,” Master of Science Thesis, Aeronautics and Astronautics, Massachusetts Institute
of Technology, June 2002.
• If the user chooses for the computer to perform a calculation, the user must be able to
explore how the computer arrived at results (and hopefully gain trust in results).
2.2.2 Multi-Epoch Analysis Submodule
Multi-Epoch Analysis allows evaluation of designs across all selected epochs of interest. All of
the analysis methods in this module are designed to be performed iteratively, so users can
use information learned from past analyses to inform future ones. Several studies, including
(Pina, 2014)18, (Fitzgerald, 2012)19, and (Schaffner 2014)20, use extensions of Fuzzy Pareto
Optimality (introduced by Smaling, 200521; illustrated in Figure 2-1) to automatically calculate
which designs appear close to the Pareto Front for the highest percentage of epochs being
analyzed. The user can calculate this for any level of “fuzziness” (distance from the true Pareto
Front), and pick the most fuzzy-Pareto efficient designs based on the results. This method, while
highly useful, is not necessarily interactive (besides setting the fuzziness factor), and requires
trust in the computer’s calculations and recommendations, which new users may not have gained yet.
Figure 2-1: Illustration of the concept of Fuzzy Pareto Optimality, where K is the level of “fuzziness”
applied to the Pareto front (left) to create the Fuzzy Pareto Front (shaded area, right). Graphic taken
from (Schaffner et al., 2014)
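To make the calculation described above concrete, the following sketch computes, for each design, the fraction of epochs in which it lies within the fuzzy Pareto set. It assumes one common formulation of fuzzy dominance (a design is excluded only if another design still dominates it after being penalized by K times the range of each objective), two objectives (utility to maximize, cost to minimize), and made-up design data; the cited studies may define their metrics differently.

```python
def fuzzy_pareto_set(designs, k):
    """designs: {name: (utility, cost)}; returns names in the K-fuzzy Pareto set.

    Assumed formulation: 'other' fuzzy-dominates 'name' if, after worsening
    'other' by k * (objective range) in both objectives, it is still no worse
    in both and strictly better in at least one. k = 0 gives the true front.
    """
    utils = [u for u, _ in designs.values()]
    costs = [c for _, c in designs.values()]
    du = k * (max(utils) - min(utils))
    dc = k * (max(costs) - min(costs))

    def dominated(name):
        u, c = designs[name]
        for other, (ou, oc) in designs.items():
            if other == name:
                continue
            no_worse = ou - du >= u and oc + dc <= c
            strictly_better = ou - du > u or oc + dc < c
            if no_worse and strictly_better:
                return True
        return False

    return {name for name in designs if not dominated(name)}

def fuzzy_pareto_fraction(epochs, k):
    """Fraction of epochs in which each design is in the K-fuzzy Pareto set."""
    counts = {}
    for designs in epochs:
        members = fuzzy_pareto_set(designs, k)
        for name in designs:
            counts[name] = counts.get(name, 0) + (name in members)
    return {name: counts[name] / len(epochs) for name in counts}

# Hypothetical (utility, cost) values for three designs in two epochs.
epochs = [
    {"A": (0.9, 10.0), "B": (0.5, 5.0), "C": (0.45, 6.0)},
    {"A": (0.2, 10.0), "B": (0.6, 5.0), "C": (0.55, 5.5)},
]
print(fuzzy_pareto_fraction(epochs, k=0.0))
print(fuzzy_pareto_fraction(epochs, k=0.2))
```

In this made-up data, design C is never on the true Pareto front but is always close to it, so raising K changes its score sharply. An interactive interface could expose K as a control so that the user sees this kind of sensitivity directly.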
For the purposes of this thesis, all modules will target newer users, therefore goals for the Multi-Epoch Analysis interface itself will center on promoting interactivity (comparing designs
visually and exploring further) to get a sense of the designs before computing the most fuzzy-
18
Pina, A.L. “Applying Epoch-Era Analysis for Homeowner Selection of Distributed Generation Power Systems,”
Master of Science Thesis, Engineering and Management, Massachusetts Institute of Technology, June 2014.
19
Fitzgerald, M.E. and Ross, A.M., "Mitigating Contextual Uncertainties with Valuable Changeability Analysis in
the Multi-Epoch Domain," 6th Annual IEEE Systems Conference, Vancouver, Canada, March 2012.
20
Schaffner, M.A., Ross, A.M., and Rhodes, D.H., "A Method for Selecting Affordable System Concepts: A Case
Application to Naval Ship Design," 12th Conference on Systems Engineering Research, Redondo Beach, CA,
March 2014.
21
Smaling, R.M. “System Architecture Analysis and Selection Under Uncertainty,” PhD thesis, Engineering
Systems Division, Massachusetts Institute of Technology, June 2005.
Pareto optimal designs. A byproduct of this interaction will hopefully be building user trust in
the computationally recommended results.
• The user should be able to evaluate and compare system performance across selected
epochs a) by user’s choice of evaluation metric and b) simultaneously (without having to
switch screens).
• The user should understand that epochs being analyzed are not necessarily sequential.
• If the user chooses for the computer to perform a calculation, the user must be able to
explore how the computer arrived at results (and hopefully gain trust in results).
2.2.3 Single-Era Analysis Submodule
Single-Era Analysis, also part of the Analyses module, allows evaluation of designs over the span
of an era, or a sequence of epochs with specified durations. Single-Era Analysis is similar to
Multi-Epoch Analysis in its design comparison objectives, but the fixed order of epochs within
an era allows decision-makers the ability to understand and utilize designs’ possibility for change
between epochs (changeability22), either manually (usually incurring additional expense) or
naturally, as well as the cumulative impact of time-varying metrics. Thus the major new benefit
of Single-Era Analysis is allowing the user to examine time- and path-dependence, as well as
identify design change strategies that keep the system delivering value even if it does not remain
in the same design state as it started in. Historically, the aforementioned fuzzy Pareto metrics
have also been used in this analysis, thus the goals for this submodule, like the last one, are
centered around promoting interactivity and building user trust as a byproduct.
• The user should be able to evaluate system performance in the whole selected era by
user’s choice of a) the same evaluation metric or b) different evaluation metrics for each
composite epoch.
• The user should understand that epochs being analyzed are sequential and thus be able to
understand path-dependent effects of epoch shifts.
• The user should be able to understand effects of designs’ potential changeability from
epoch to epoch.
• If the user chooses for the computer to perform a calculation, the user must be able to
explore how the computer arrived at results (and hopefully gain trust in results).
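As one simple illustration of the cumulative impact of a time-varying metric over an era, the sketch below computes a duration-weighted average utility for a single, unchanging design. The aggregate chosen, the epoch labels, and the utility values are all assumptions made here for illustration; the thesis does not prescribe a particular era-level metric, and this sketch ignores design changes between epochs.

```python
def time_weighted_utility(era, utility_by_epoch):
    """Average utility over an era, weighting each epoch by its duration.

    era: ordered list of (epoch_id, duration) pairs.
    utility_by_epoch: utility of one fixed design in each epoch.
    """
    total_time = sum(duration for _, duration in era)
    weighted = sum(utility_by_epoch[epoch] * duration for epoch, duration in era)
    return weighted / total_time

# Hypothetical era: epoch E1 for 2 years, E2 for 1 year, then E1 for 1 year.
era = [("E1", 2.0), ("E2", 1.0), ("E1", 1.0)]
utility_by_epoch = {"E1": 0.8, "E2": 0.2}
print(round(time_weighted_utility(era, utility_by_epoch), 2))  # 0.65
```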
2.2.4 Multi-Era Analysis Submodule
Logically following from the previous two analysis types, Multi-Era Analysis allows evaluation
of designs across all selected eras of interest. Similar to Single-Era Analysis, this submodule can
be very valuable in allowing a user to explore time- and path- dependencies and identify
strategies based on design changeability. The last process introduced in EEA, this type is the
22
Ross, A.M. and Hastings, D.E., "Assessing Changeability in Aerospace Systems Architecting and Design Using
Dynamic Multi-Attribute Tradespace Exploration," AIAA Space 2006, San Jose, CA, September 2006.
most complex of the analyses, relying on and synthesizing results gleaned from other analyses.
In his Master’s Thesis, Schaffner (2014) explores the process of performing Multi-Era Analysis,
stating that “the amount and variety of information that can be incorporated into [Multi-Era
Analysis] is significant,” going on to describe how the process takes different forms depending
on the inputs provided and the seven activities that can be performed as part of this process,
including identifying metrics of interest, creating design change strategies, creating eras, and
evaluating, or generating data for, each design-strategy-era combination23. For the purposes of
this thesis, we assume that any possible aforementioned processes given the available inputs are
done, and focus on the last of these processes: Results Analysis, or the exploration of system
behavior and trajectory based on what analyses have been conducted so far.
• The user should be able to evaluate and compare system performance across selected eras by
user’s choice of a) the same evaluation metric or b) different evaluation metrics for each
composite epoch in the eras, and simultaneously (without having to switch screens).
• The user should understand that eras being analyzed are not necessarily sequential, but
epochs within eras are.
• The user should be able to understand effects of designs’ potential changeability from
epoch to epoch.
• The user should be able to understand and compare time- and path-dependent effects of
epoch durations and shifts.
• If the user chooses for the computer to perform a calculation, the user must be able to
explore how the computer arrived at results (and hopefully gain trust in results).
23
Schaffner, M.A., “Designing Systems for Many Possible Futures: The RSC-based Method for Affordable
Concept Selection (RMACS), with Multi-Era Analysis,” Master of Science Thesis, Aeronautics and Astronautics,
Massachusetts Institute of Technology, June 2014.
Chapter 3: Usability Criteria
Functionality is only one attribute of a system. Analogous to Ricci and Schaffner’s concepts of
“trust” and “truthfulness” in a model (Ricci et al. 2014)24, flawless functionality of an interface
(“truthfulness”) does not guarantee usability (“trust”). This usability can only be earned with
good interface design. Now that we have presented functionality criteria in the form of
submodule goals in the previous chapter, in this chapter we examine the usability criteria for
evaluating the interfaces that display these visualizations to users. Overall design principles for
creating good data graphics are introduced first, followed by usability criteria based on principles
for good user interface design.
3.1 Overall Design Considerations
There are countless guidelines that have been developed over the years that prescribe measures
to be taken when creating data visualizations. These range from very general (e.g. important data
should be easy to find and understand; tell the truth about the data) to very specific (e.g. colors
should be chosen so that all, including color-blind, users can distinguish them; avoid using gray
scale to represent more than 2-4 values; words should be spelled out and run horizontally, left-to-right). There should be internal consistency within the visualizations, as well as external
consistencies with common conventions the user may be familiar with. Visualizations should
attract anyone viewing them to think about the substance rather than the methodology or any
other distracting features. They should be clear and reveal the data at several levels of detail,
attracting and encouraging the user to explore further. Above all, they should enable the user to
be more productive, efficient, and/or gain more insight than they could have without the tool
(Ware 2013; Tufte 1983)25,26.
Especially when dealing with quantitative data, it is important to take into account how different
values are encoded to reflect their size or order. According to a 1984 study by Cleveland and
McGill, humans most accurately decode quantitative data encoded in the following ways, ranked
from most to least accurate (Cleveland 1984)27:
1) Position along a common scale (e.g. scatterplots)
2) Position along nonaligned scales (e.g. multiple scatterplots)
3) Length, direction, angle/slope (e.g. bar chart, pie chart)
4) Area (e.g. bubbles)
5) Volume, curvature (e.g. spheres)
24
Ricci, N., Schaffner, M.A., Ross, A.M., Rhodes, D.H., Fitzgerald, M.E., "Exploring Stakeholder Value Models
Via Interactive Visualization," 12th Conference on Systems Engineering Research, Redondo Beach, CA, March
2014.
25
Ware, Colin. Information Visualization: Perception for Design. Elsevier, 2013.
26
Tufte, Edward R. The Visual Display of Quantitative Information. Cheshire, Conn.: Graphics, 1983.
27
Cleveland, W.S. and R. McGill, “Graphical Perception: Theory, Experimentation, and Application to the
Development of Graphical Methods,” Journal of the American Statistical Association, 79-387, 1984.
6) Shading, color saturation (e.g. heatmap)
Thus, representations of quantitative (or even some categorical) information should take this list
into account. While contrasting visual variables can add plenty of information,
visualization designers must constantly be wary of the tradeoff between information displayed
and simplicity. Selectivity and associativity are two properties that illustrate this tradeoff:
Selectivity is “the degree to which a single level of variable can be selected from the entire visual
field,” whereas associativity “refers to how easy it is to ignore the variable” (Miller 2015)28.
While important elements of a visualization should appropriately stand out, unnecessary
encoding of unimportant information should be avoided (without, of course, sacrificing too much
functionality).
While the more general guidelines are more obvious and widely accepted, more specific ones
should be treated with caution, as not all users share the same visual preferences or cognitive
processes. It is important to note here the differences between perception and cognition. Early
stages in human visual processing are largely automatic and based on general human perception,
independent of cognition: detecting basic figures, borders, and distinguishing a foreground from
a background. As this information reaches later stages of processing, it is combined with an
individual’s long-term visual memory to allow him or her to understand what he or she is seeing.
Later stages are thus influenced by an individual’s knowledge, or cognition.29 It is important
to optimize designs and visualizations for both human perception (general) and cognition
(individual).
We assume all of these guidelines strive to optimize the processing power of human cognitive
ability to understand whatever is being displayed. Tufte summarizes the concept of graphical
excellence as “that which gives to the viewer the greatest number of ideas in the shortest time
with the least ink in the smallest space.” For all of the types of visualizations we are about to
introduce, we assume that these baseline design criteria will be heeded and satisfied in the
implementations.
3.2 Usability
For the purposes of this thesis, we will focus on three dimensions of usability: Learnability,
Efficiency, and Error-Tolerance. For each of these dimensions, we list a few questions to help
guide the evaluation thought process, followed by discussion about further concepts and metrics
related to the usability dimension. When evaluating the usability of the Epoch Sampling
submodule later on, we will pay special attention to the italicized concepts introduced in each of
these sections.
28
Miller, R. 6.831 User Interface Design and Implementation, Spring 2015. (Massachusetts Institute of
Technology: MIT Stellar <https://stellar.mit.edu/S/course/6/sp15/6.813/materials.html>)
29
Tacca, M. C. “Commonalities between Perception and Cognition,” Frontiers in Psychology, 2: 358. PMC. 2011.
3.2.1 Learnability
The learnability of an interface refers to how easy it is for a new user to learn the complete
functionality of an interface without outside help. Some questions to consider when evaluating
learnability are:
• Is the interface easy to learn at first?
• How helpful is the interface?
• Can tasks be completed and mastered without outside help?
• Does the interface have built-in instructions or guidance?
While many pieces of technology were developed with the assumption that users would read a
manual or take a class first, that is increasingly not the case. More often than not, users are goal-oriented, and will learn to operate a system by way of exploring how to complete tasks (learning
by doing) or by seeing others complete a task (learning by watching). If users need help from the
system along the way for whatever reason, the help must be searchable and goal-oriented in
order to be most effective. Because it is much easier for users to recognize visual cues
(“recognition” – knowledge in the world) than to remember without such help (“recall” –
knowledge in the head), it is important that a system provide cues to help the user rather than
require the user to remember everything about its operation. As mentioned in the previous section, consistency is important
within the interface as well as externally (so perhaps users can transfer existing knowledge from
other applications to aid in using this interface). Quick, visible system responses are also critical
so that users can get immediate feedback on whether or not they have actually done something.
If an interface has multiple states or modes, these should also be very apparent to the user (and
their transitions, if applicable). Finally, the interface should provide affordances: perceived
properties of an object that suggest how it can be used. For example, a text box offers the
affordance that a user can click into it and type. Ideally, an object’s perceived properties to the
user should match its actual properties, so the user knows exactly what s/he is to do with the
object (Miller 2015).
3.2.2 Efficiency
The efficiency of an interface refers to how fast it is for returning users to navigate and perform
tasks using the interface. Some questions to consider when evaluating efficiency are:
• Once learned, is the interface fast to use?
• How long does it take to complete common tasks?
• Does the interface feel efficient to users?
• Are there bottlenecks or shortcuts?
Once a user is familiar with a system, s/he tends to group parts of it into a single unit of memory.
This is called “chunking,” and good interfaces should present information in chunks that are easily
recognizable by the user. The interface should also be fast to navigate, in terms of pointing and
steering. Fitts’s Law, T = a + b*log(D/S+1) = RT + MT, represents the time T it takes to move
your hand to a target of size S at distance D, or the reaction time RT plus the movement time
MT. This law for pointing has many implications for interface design to speed up pointing time,
such as the fact that targets at the edge of the screen are easy to hit, whereas targets separated
from the edge by an unclickable margin require increased accuracy. To aid with pointing
efficiency, it is good to make frequently-used targets bigger and put them near each other.
There is a similar law for steering, T = a + b*D/S, representing the time T that it takes to move
your hand through a tunnel of length D and width S. The index of difficulty, D/S, is now linear
instead of logarithmic (a and b are empirically fitted constants in both laws), showing that
steering is much harder than pointing. Thus, requiring the user to steer through narrow tunnels
on the screen will severely damage efficiency. Keyboard shortcuts or anticipating the user’s next
movement (e.g. autocomplete) also help users perform tasks faster
(Miller 2015).
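The difference between the two laws can be sketched numerically. The snippet below is only an illustration: the base-2 logarithm and the constants a and b are assumptions (the text gives neither), and the function names are hypothetical.

```python
import math


def pointing_time(a: float, b: float, distance: float, size: float) -> float:
    """Fitts's Law: T = a + b * log2(D/S + 1). The base-2 logarithm is an assumption."""
    return a + b * math.log2(distance / size + 1)


def steering_time(a: float, b: float, length: float, width: float) -> float:
    """Steering law: T = a + b * (D/S)."""
    return a + b * (length / width)


# Hypothetical constants (seconds); real values are fit empirically per device and user.
A, B = 0.1, 0.15

for ratio in (1, 7, 31):
    # Use the same D/S ratio for both laws, with S fixed at 1.
    print(ratio, round(pointing_time(A, B, ratio, 1), 3), round(steering_time(A, B, ratio, 1), 3))
```

As the D/S ratio grows, the steering time grows in proportion to it while the pointing time grows only with its logarithm, which is the reason narrow tunnels are so costly.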
3.2.3 Error-Tolerance
The error-tolerance, or safety, of an interface deals with how the interface prevents errors and
helps users recover from any errors they make while using the interface. Some questions to
consider when evaluating error-tolerance are:
• Are errors few and recoverable?
• Does the interface help to prevent errors?
• Does the interface help when errors occur?
Human error is unavoidable. Slips (failure of execution) and lapses (failure of memory) are fairly
common simply due to inattention, but interfaces should take measures to prevent complete
mistakes (using the wrong procedure for a goal). Some ways of accomplishing this are avoiding
actions with similar descriptions, avoiding habitual action sequences with identical prefixes,
and/or adding confirmation dialogs, clearly marked exits, manual overrides, error messages or
the ability to undo (Miller 2015).
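Of the measures listed above, the ability to undo is perhaps the simplest to sketch. The class below is a hypothetical illustration (its name and methods are not from any library): it takes a snapshot before each destructive action so that a slip can be reversed.

```python
class UndoableList:
    """A list of items whose removals can be undone (hypothetical helper)."""

    def __init__(self, items):
        self.items = list(items)
        self._history = []  # snapshots taken before each destructive action

    def remove(self, item):
        snapshot = list(self.items)
        self.items.remove(item)          # raises ValueError if the item is absent
        self._history.append(snapshot)   # record the snapshot only if removal succeeded

    def undo(self):
        """Restore the most recent snapshot; do nothing if there is none."""
        if self._history:
            self.items = self._history.pop()


designs = UndoableList(["design A", "design B", "design C"])
designs.remove("design B")  # a slip: the wrong design was removed
designs.undo()              # the list returns to its previous state
print(designs.items)
```

Storing whole snapshots is wasteful for large states; it is used here only because it keeps the sketch short.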
Chapter 4: Visual Analytics
Leo Cherne has been credited with saying, “The computer is incredibly fast, accurate, and stupid.
Man is unbelievably slow, inaccurate, and brilliant. The marriage of the two is a force beyond
calculation.”30 The field of visual analytics aims to leverage such a marriage for vast potential
knowledge gain. Visual Analytics, according to Keim, et al. (2008), can be described as “an
iterative process that involves information gathering, data preprocessing, knowledge
representation, interaction and decision making.” To reach the ultimate goal of gaining insight
into a problem described by a large amount of data, this field “combines the strengths of
machines with those of humans”31. A graphic of the visual analytics process is shown in Figure
4-0. As shown, there are multiple ways one can go through this process, and more feedback
loops can result in more informed analysis.
Figure 4-0: The visual analytics process, taken from (Keim, et al. 2010).
Well-defined problems with clear sets of rules are poor choices for visual analytics since
computers can simply be programmed to arrive at an optimal answer without human input.
Similarly, exploratory problems with small amounts of data that humans can sift through to
arrive at an optimal answer are also poor choices for visual analytics as they do not require the
aid of a computer. Thus, good applications of visual analytics require harnessing the
computational power of a computer with the insight and intelligence of a human, making it a
powerful tool for the type of exploratory data analysis needed in IEEA.
30 Chang, Remco. “Big Data Visual Analytics: A User-Centric Approach” [PowerPoint Slides].
31 Keim, D. A., Mansmann, F., Schneidewind, J., Thomas, J., & Ziegler, H. (2008). Visual Analytics: Scope and
Challenges. In Visual Data Mining (pp. 76–90).
Keim et al. (2008) present a “visual analytics mantra” to describe the process:
“Analyse First –
Show the Important –
Zoom, Filter and Analyse Further –
Details on Demand”32
By definition, visual analytics requires the use of visualizations that humans can interact with.
The rest of this chapter surveys some existing data visualization techniques, focusing on those
intended for multidimensional (referring to the dimensionality of independent variables) and
multivariate (referring to that of dependent variables) datasets, as well as some techniques for
interacting with data. It is important to note that this thesis does not present an exhaustive list
of all possible visualizations or data manipulation techniques, but rather a selection of fairly
basic and common visualizations that carry potential to be used in
IEEA modules.
The visualizations are grouped into four types, discussed one per section: Geometric,
Pixel-Based, Icon-Based, and Hierarchy-Based. These will mainly be static visualizations, but those
that necessarily involve interaction will be marked as such. The last section will be devoted to
discussing interaction schemes. As mentioned in Section 3.1 above, we assume that the
previously discussed baseline design criteria will be satisfied in any implementation of the
visualizations being presented. In the following chapters, we will pick a few different
[interactive] visualizations for each IEEA submodule we present, and assess the extent to which
they meet the functionality goals (as presented in Chapter 2) for each respective submodule.
4.1 Geometric Visualizations
Geometric visualizations are perhaps the most common and broad category of visualizations,
mapping data attributes to a two- (or sometimes three-) dimensional surface (Chan, 2006)33.
These include scatterplots, line graphs, parallel coordinate plots (including polar charts), force
diagrams, and Sankey diagrams.
4.1.1 Scatterplots
As seen from Cleveland and McGill’s aforementioned study, people are most accurately able to
decode information when it is represented by position along a common scale, making a
scatterplot a good place to start. A scatterplot allows one variable to be mapped on each axis, so
each point’s location encodes two of its attributes in a form that is easy for new users to read. An
example is shown in Figure 4-1.
32 Keim, D. A., Mansmann, F., Schneidewind, J., Thomas, J., & Ziegler, H. (2008). Visual Analytics: Scope and
Challenges. In Visual Data Mining (pp. 76–90).
33 Chan, W.W. “A Survey on Multivariate Data Visualization.” Department of Computer Science and Engineering,
Hong Kong University of Science and Technology, June 2006.
Figure 4-1: Example of scatterplot
4.1.1.1 Bubble Charts and Motion Charts
A bubble chart is a scatterplot with two additional values encoded with color and size. Additional
visual variables such as shape and orientation can encode more variables, but as this may
overwhelm a user with information, bubble charts are most commonly thought of as simply
x/y/color/size charts. Figure 4-2 shows an example of such a bubble chart.
Motion charts are yet another extension of scatterplots/bubble charts, animating the trajectory of
points as another variable changes (most commonly, time). If the dataset that produced
Figure 4-2 had multiple years’ worth of similarly stored data, this visualization could easily be
turned into a motion chart, where the x-position, y-position, color, and/or size vary as time t
increases.
Figure 4-2: Example of bubble chart
4.1.1.2 Scatterplot Matrices
Scatterplot matrices are essentially just what they sound like: a grid of 2D scatterplots, usually
used with high-dimensional data to view “cross sections,” or pairwise relationships between
dimensional attributes. Generally all plots in a column share the same horizontal-axis variable
and all plots in a row share the same vertical-axis variable, on the same scale, to reduce
confusion in what can already be a cluttered set of plots (Hoffman, 1999). An example
scatterplot matrix is shown in Figure 4-3. Notice that the entire grid is symmetrical across the
top-left to bottom-right diagonal, so only the lower or upper triangle is needed to convey the
same amount of information.
There are many variants of scatterplot matrices. Some recognize that plotting a variable against
itself (along the diagonal) is uninformative, particularly when there are more points than can be
told apart. Figure 4-4 shows a scatterplot matrix with histograms of the value distribution for
each variable plotted along its diagonal instead.
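Because the grid is symmetrical, an implementation only needs to draw the panels on one side of the diagonal. The following sketch enumerates those panels; the function name and the variable names are hypothetical and are not taken from the dataset in Figure 4-3.

```python
from itertools import combinations


def lower_triangle_panels(variables):
    """Return (row_variable, column_variable) pairs for the panels below the diagonal."""
    return [(variables[j], variables[i])
            for i, j in combinations(range(len(variables)), 2)]


# Hypothetical variable names for a 6-dimensional car dataset.
car_variables = ["mpg", "cylinders", "horsepower", "weight", "acceleration", "year"]
panels = lower_triangle_panels(car_variables)
print(len(panels), "panels instead of", len(car_variables) ** 2)
```

For n variables this yields n(n-1)/2 panels, leaving the diagonal free for something more useful such as the histograms of Figure 4-4.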
Figure 4-3: Scatterplot Matrix of a 6-dimensional car dataset, with variables plotted pairwise, from
(Hoffman, 1999)34
Figure 4-4: Scatterplot Matrix with histograms plotted along diagonal, from (Grinstein, 2001)35
34 Hoffman, P.E. “Table Visualizations: A Formal Model and Its Applications”, Doctoral Dissertation, Computer
Science Department, University of Massachusetts at Lowell, 1999.
35 Grinstein, G., Trutschl, M., and Cvek, U. “High-Dimensional Visualizations.” 7th Data Mining Conference-KDD
2001.
4.1.2 Line Graphs
Line graphs are very similar to scatterplots, but in line graphs, the data points for each value of
the independent variable are connected together to form a line, highlighting the local change
between pairs of adjacent points and the overall trend. For this reason, line graphs are especially
useful in visualizing data with an ordinal independent variable, such as time series data. Multiple
variables can be encoded, similar to scatterplots, using line color and shape of points, as long as
adding these variables does not detract from the interpretation of the relationship between
adjacent points. For example, Figure 4-5 shows a line graph with two independent variables:
month (on the x-axis) and year (represented by color and shape as seen in the key). The multiple
lines on the same plot allow a viewer to easily spot trends within a single year as well as
differences and common patterns across different years.
Figure 4-5: Line graph with multiple lines, from (Wallace 2004) 36
4.1.3 Parallel Coordinate Plots
Parallel coordinate plots, as shown in Figure 4-6, display high-dimensional data by representing
each variable on its own vertical axis (in Fig. 4-6, these variables are “Sepal Width,” “Sepal Length,”
“Petal Width,” and “Petal Length”); the axes are not necessarily scaled the same. An individual line
spanning the axes represents the data point that takes the value at which it intersects each axis (for
example, following the red line at the top of the “Sepal Width” axis, the corresponding entry
seems to have the following approximate values: Sepal Width – 4.4, Sepal Length – 5.8, Petal
Width – 0.4, Petal Length – 1.5). Additional characteristics can be encoded in color, as in
Fig. 4-5, but are not necessary. Parallel coordinate plots are quite effective at representing and
revealing patterns in high-dimensional data when each data point has slightly different values;
however, the horizontal order of the axes may affect interpretation, as different patterns may
emerge with different orders (Chan, 2006; Grinstein, 2001).
36 Wallace, Rosa. “Graphic Resources.” NC State University <https://www.ncsu.edu/labwrite/res/gh/ghlinegraph.html> 2004.
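A minimal sketch of how one record becomes a line across independently scaled axes is given below. The function name is hypothetical, the axis ranges are invented for illustration, and the record uses the approximate values read off the figure in the text.

```python
def to_polyline(record, axes, ranges):
    """Map one record to (axis_index, height) vertices, scaling each axis to [0, 1] on its own."""
    points = []
    for i, name in enumerate(axes):
        lo, hi = ranges[name]
        points.append((i, (record[name] - lo) / (hi - lo)))
    return points


axes = ["Sepal Width", "Sepal Length", "Petal Width", "Petal Length"]
# Hypothetical axis ranges; a real plot would take them from the data.
ranges = {"Sepal Width": (2.0, 4.4), "Sepal Length": (4.0, 8.0),
          "Petal Width": (0.0, 2.5), "Petal Length": (1.0, 7.0)}
record = {"Sepal Width": 4.4, "Sepal Length": 5.8, "Petal Width": 0.4, "Petal Length": 1.5}

print(to_polyline(record, axes, ranges))
```

Reordering the `axes` list changes which pairs of variables sit next to each other, which is why the axis order can affect the patterns a viewer perceives.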
Figure 4-6: Example of Parallel Coordinate Plot, from Wikipedia37
4.1.3.1 Polar Charts
Polar charts, also known as spider charts or kiviat diagrams, are a circular extension of parallel
coordinates, essentially pinching the coordinate axes around into a circle, creating a wrap-around
version of Figure 4-6 where each design is now represented as a closed shape (Hoffman 1999)38. An
example of this kind of visualization is shown in Figure 4-7.
37 en.wikipedia.org/wiki/Parallel_coordinates
38 Hoffman, P.E. “Table Visualizations: A Formal Model and Its Applications”, Doctoral Dissertation, Computer
Science Department, University of Massachusetts at Lowell, 1999.
Figure 4-7: Polar chart showing Iris Flower dataset (left) and RadViz showing example car dataset
(right). Both images taken from (Hoffman 1999).
4.1.3.2 RadViz
A similar idea is the Radial Coordinate Visualization (RadViz), proposed by (Hoffman 1999),
where n-dimensional data can be plotted on n axes emanating outwards from the center of a
circle and ending on the circle’s perimeter, also seen in Figure 4-7. Axes are normalized, and each
data point is plotted as if it were attached by a separate spring to each of its axis intersection
points from the polar chart. This kind of visualization minimizes clutter and allows for easy
spotting of outliers, irregularities, or patterns, though the location of the points largely depends
on the organization and order of the axes.
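The spring metaphor can be made concrete with the formulation in which RadViz is commonly described, which is an assumption here rather than something stated in the text: each axis ends at an anchor on the unit circle, each spring's stiffness equals the point's normalized value on that axis, and the point rests where the spring forces balance. The function name is hypothetical.

```python
import math


def radviz_position(values):
    """Place one record inside the unit circle.

    `values` holds one value per axis, already normalized to [0, 1].
    The result is the stiffness-weighted average of the anchor positions.
    """
    n = len(values)
    total = sum(values)
    if total == 0:
        return (0.0, 0.0)  # no spring pulls at all: leave the point at the center
    x = sum(v * math.cos(2 * math.pi * i / n) for i, v in enumerate(values)) / total
    y = sum(v * math.sin(2 * math.pi * i / n) for i, v in enumerate(values)) / total
    return (x, y)


print(radviz_position([0.9, 0.1, 0.1, 0.1]))  # pulled toward the first anchor
print(radviz_position([0.5, 0.5, 0.5, 0.5]))  # balanced: close to the center
```

Because the anchors are placed in list order, permuting the axes moves every point, which is the order dependence noted above.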
4.1.4 Force Diagrams (Interactive)
A force diagram is an interactive graph layout in which linked data points form connected
components that behave as though they were attached by springs. This visualization, supported
by d3.js39, relies on physical simulation to allow users to explore to what extent data points affect
the other data points in the visualization. An example of two different positions of the same
force-directed graph is shown in Figure 4-8, in which data points are characters in Victor Hugo’s
Les Miserables, and connections between points indicate that the characters appear in a
scene together. When a user drags points around the periphery, such as the pink points at the
bottom of the left pane, the whole graph does not move much. However, when a more centrally
connected point, such as the light blue point in the middle representing Valjean (the main
character), is moved, the whole graph is affected by it, as seen in the right pane.
39 Data-Driven Documents, d3js.org
Figure 4-8: Force-Directed Graph depicting character co-occurrence in “Les Miserables,” from
(Bostock, 2012)40
4.1.5 Sankey Diagrams
Sankey diagrams are widely used in fields such as chemical engineering that have an abundance
of processes in which heat, energy, or other quantities flow between nodes. To represent the
volume of flow between nodes, one-directional arrows have widths proportional to the flow
quantity they represent41,42. Sankey diagrams do a very good job of representing a particular kind
of data, but are close to nonexistent in fields that do not deal with flow data. An example of this
kind of diagram is shown in Figure 4-9. This visualization is discussed for the sake of
completeness, but also because different ways of encoding nodes may make this diagram useful
for systems engineering decision-making purposes.
40 Bostock, M. “Force-Directed Graph” <http://bl.ocks.org/mbostock/4062045> Nov 2012.
41 Bostock, M. “Sankey Diagrams” <http://bost.ocks.org/mike/sankey/> May 2012.
42 Sankey Diagrams. “Sankey Definitions” <http://www.sankey-diagrams.com/sankey-definitions/>
Figure 4-9: Sankey diagram showing a possible scenario for UK energy production and consumption in
2050, with supply on the left and demands on the right, from (Bostock, 2012)
4.2 Pixel-Based Visualizations
Pixel-oriented visualizations map data attributes to pixels based on a color scale. This tends to
reduce visual noise, as a large amount of dimensional information can be encoded in a small space
(Chan, 2006). These include pixel bar charts and color mapping (heatmaps).
4.2.1 Pixel Bar Charts
A bar chart, like a scatterplot, is an extremely common type of geometric visualization in
everyday graphics, displaying a rectangle for each type of data point, whose length is
proportional to its (1D) value. There are many variants on bar charts (e.g. plotted vertically or
horizontally, cumulative, stacked, etc.), but a pixel-based bar chart can encode additional
information (for additional variables) by representing each individual data point within a bar
with a colored pixel, as seen in Figure 4-10. Thus, along with the total number of data items per
type, pixel bar charts allow individual attribute values to be seen at a glance.
Figure 4-10: Equal-height pixel bar chart with color encoding different attributes, from (Chan, 2006)
4.2.2 Color Mapping (Heatmaps)
By now, encoding additional attribute values with color or hue is not a new idea. However, for
completeness, the most straightforward single-pixel-encoding method is presented: heatmaps.
For a given 2D area, whether part of another visualization (as in pixel bar charts above) or not,
every pixel represents a data point, and the value of that pixel represents its data point’s value.
Care must be taken if the value of a pixel is represented by color, as the color wheel does not
actually have a “natural order,” due to the nonstandard and/or cyclic nature of many color
palettes. Another concern with using color is user color-blindness, or the inability to tell certain
colors apart. A more standard way to represent the value of a pixel is by hue, used here to mean
the lightness or saturation of a single color. No matter what the base color, there is a natural
order from light to dark, and color-blindness is much less of an issue (Spears, 1999)43. A couple
of examples of heatmaps are shown in
Figure 4-11, one using hue (in this case, grayscale) and one using color (over an actual map).
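As a rough sketch of the grayscale case, the function below maps each value in a grid linearly onto a gray level. The function name is hypothetical, and the choice to show low values as dark and high values as light is an arbitrary assumption; the reverse works equally well as long as a legend says which is which.

```python
def to_gray_levels(grid):
    """Map each value to a gray level from 0 (dark, lowest) to 255 (light, highest)."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    return [[round(255 * (v - lo) / span) for v in row] for row in grid]


# A tiny made-up 2x2 dataset.
print(to_gray_levels([[0, 2], [4, 10]]))
```

Because the mapping is monotonic, a lighter pixel always means a larger value, which is the “natural order” that a cyclic color palette lacks.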
43 Spears, W.M. “An Overview of Multidimensional Visualization Techniques.” Evolutionary Computation
Visualization Workshop, 1999.
Figure 4-11: Heatmaps encoding data in every pixel. Random data set encoded into a 10x10 pixel square
(left) from (Grinstein, 2001), and local thermal power data encoded into a map of the whole US (right)
from (ICM Consulting 2015)44
4.3 Icon-Based Visualizations
Much like encoding information in hue or color, it is possible to encode additional information in
shape, which is the idea behind iconography, or icon-based visualization techniques. Data items
are mapped to icons, or glyphs, whose shape and features differ depending on attribute values.
While hue-encoding works well for quantitative data, iconography works better for categorical
data, as shape features are also more categorical in that they do not necessarily have an order.
While humans generally recognize graphical features more readily than simple geometric
shapes/patterns, this advantage only holds up to a smaller volume of data than can be displayed
with geometric techniques, so data sizes for icon-based visualizations are generally on the smaller
side. As a caveat, Chan points out that while geometric techniques “treat all the dimensions
equally, some features in glyphs are more salient than others, adjacent elements are easier to be
related and accuracy of perceiving different graphical attributes varies between humans
tremendously. It thereby introduces biases in interpreting the result” (Chan, 2006).
Icon-based visualizations include star plots, Chernoff faces, stick figures, and color/shape icons.
4.3.1 Star Plots
Introduced by Chambers, et al. in 1983, star plots are star-shaped figures that can display n
dimensions using n rays emanating from the center of the glyph. All variables can be displayed
in each figure, and the lengths of the rays are proportional to the values of the variables they
represent. There is no standard for the way the data are arranged on a page, but usually the
figures are placed into a rectangular array with some sort of grouping or ordering based on
variable values to make certain trends, groupings, or features apparent at a glance. Figure 4-12
shows an example of a 12-dimensional car dataset with 36 points. The ray emanating straight
downward from the center represents “weight,” so it is noticeable that the data in this figure are
arranged such that the lightest cars are at the top of the figure, while the heaviest are at the
bottom (Friendly, 1991)45.
44 http://icmconsulting.com/media/uploads/Geothermal_heat_map_US.png
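The geometry of a single star glyph can be sketched as follows. The function name, the even angular spacing starting at angle zero, and the scaling of each ray by a per-variable maximum are assumptions made for illustration.

```python
import math


def star_vertices(values, max_values, radius=1.0):
    """Return the (x, y) tip of each ray; ray i has length radius * value / max."""
    n = len(values)
    tips = []
    for i, (value, vmax) in enumerate(zip(values, max_values)):
        angle = 2 * math.pi * i / n       # rays are spaced evenly around the center
        length = radius * value / vmax    # ray length is proportional to the value
        tips.append((length * math.cos(angle), length * math.sin(angle)))
    return tips


# One made-up record with four variables, each scaled by its own maximum.
print(star_vertices([30, 4, 100, 2000], [40, 8, 200, 4000]))
```

Connecting the tips in order gives the outline of the glyph; arranging many such glyphs in a grid produces a display like Figure 4-12.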
Figure 4-12: 36 twelve-dimensional data points represented as star plots, and organized by “weight”
(bottom variable), from (Friendly, 1991)
4.3.2 Chernoff Faces
Another famous icon-based visualization is the set of Chernoff faces, named after their inventor
Herman Chernoff (1973), where each data point is represented as a human face. This was
originally proposed because of humans’ natural ability to differentiate between and recognize
human faces. Data attributes are mapped to different facial features: head eccentricity, eye
eccentricity, pupil size, eyebrow slant, nose size, mouth shape, eye spacing, eye size, mouth
length, and degree of mouth opening. These faces may again be arranged in any way: randomly, on
a rectangular grid, ordered to bring out a salient feature, or on a scatterplot (Liu 2014; Spears,
1999; Chan, 2006)46. Figure 4-13 shows a set of 12 different facial features as well as a
scatterplot of Chernoff faces. It is important to note that the assignment of dimensions to facial
attributes really matters in this visualization, as human facial recognition ability can introduce
strong bias into the interpretation of the faces.
45 Friendly, M. “Statistical Graphics for Multivariate Data.” SAS SUGI 16 Conference, Apr 1991.
46 Liu, Y. “Visualization of Multivariate Data” Department of Biomedical, Industrial and Human Factors
Engineering, Wright State University, Fall 2014.
<http://www.stat.sc.edu/~hansont/stat730/MultivariateDataVisualization.pdf>
Figure 4-13: Different Chernoff facial features (left) and Chernoff faces plotted in various 2D positions
on a scatterplot (right), taken from (Chan, 2006)
4.3.3 Stick Figures
Similar to Chernoff faces, stick figures (introduced by Pickett & Grinstein, 1988) encode
attributes in the angle, length, thickness, or color of the 5 “limbs” in the body of a stick figure.
Usually stick figures are plotted on a scatterplot with the two most important attributes being
represented in the x- and y- position of the figure on the plot (Chan 2006; Liu 2014). Figure 4-14
shows an example family of 12 stick figures (with 10 features - angle and length for each limb),
as well as a full scatterplot of properly positioned stick figures.
Figure 4-14: A family of 12 stick figures (left) and a scatterplot of stick figures (right), taken from (Liu
2014)
4.3.4 Color/Shape Icons
Color icons are essentially a hybrid between heatmaps and icons, assigning a pixel or region of
the icon to each attribute, and encoding the value through color or texture (Chan, 2006). The
idea of icon hybrids opens up a world of extensions, where attributes can be encoded by any
visual variable (e.g. color, hue, orientation, shape, texture, size, etc.) on an icon of any inherent
shape, size, etc. for additional dimensionality mapping.
4.4 Hierarchy-Based Visualizations
Hierarchical techniques all require data to be organized in a format such that each data point
belongs to a certain “level” or has a parent and/or child nodes. In other words, data must be
structured hierarchically. This allows the space to be subdivided recursively and to present
logarithmically more information in the same space (Chan, 2006). Hierarchy-based
visualizations include hierarchical axes, trees, treemaps, and circle packing.
4.4.1 Hierarchical Axes
This technique partitions the display axis repeatedly, plotting (“stacking”) data elements within
other data elements, with the most important variable being plotted first, then the next most
important being plotted within that, etc. Color coding the final visualization helps distinguish
layers, and can also perform double-duty by encoding additional information if necessary (Chan,
2006). Figure 4-15 shows both the splitting scheme of the axis as well as the final visualization.
Figure 4-15: Splitting scheme of hierarchical axes (left) next to the final histograms-within-histograms
matrix visualization (right), from (Chan, 2006)
4.4.2 Trees
Trees make it fairly straightforward to visualize all of the options for each variable, making this
option very useful for hierarchical data. (This visualization is also very well supported in d3.js.)
To find the characteristics of a certain leaf node (at the bottom of a tree), one traverses up the
path from the leaf to the root; similarly, to find a node with specified values for each variable,
one traverses down the corresponding path from the root to reach that node. An example
diagram of such a visualization is shown in Figure 4-16.
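The upward traversal can be sketched with nothing more than a child-to-parent mapping. The function name and the tiny hierarchy below are hypothetical and serve only to illustrate the idea.

```python
def path_to_root(node, parent):
    """Walk from a node up to the root using a child -> parent mapping."""
    path = [node]
    while node in parent:  # the root is the only node without a parent entry
        node = parent[node]
        path.append(node)
    return path


# Hypothetical design hierarchy: each level of the tree fixes one design variable.
parent = {
    "2 engines": "large payload",
    "4 engines": "large payload",
    "large payload": "root",
    "small payload": "root",
}

print(path_to_root("4 engines", parent))
```

Reading the returned path lists the value chosen at each level for that leaf; the downward search is the mirror image, following one child per variable from the root.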
Figure 4-16: Example unlabeled tree visualization, from (BigML Blog 2012)47
4.4.3 Treemaps
Treemaps are another hierarchical layout also supported by d3.js in which tree nodes are
represented by rectangles, and “parent” rectangles are recursively partitioned into smaller
“children” rectangles, much like a 2D version of hierarchical axes. Again, this is obviously very
effective for hierarchical data, as well as representing all of a tree’s leaf nodes compactly (Wang,
2006)48. There is also, of course, the option of encoding additional values in shapes’ color and
size to reveal more attributes of leaf nodes. Treemaps are generally good for representing trees
when the distribution of children is non-uniform, so that they can display the variety at a glance.
An example of such a treemap is shown in Figure 4-17, displaying the populations of all the
countries in the world. The tree structure in this example stores the six continents as the children
of the root, and each continent’s children are all the countries that belong to that continent. The
divisions between continents in the figure are denoted with bold black lines, whereas those
between countries are simply gray. In this example, the country’s population is encoded in area
and its Gross National Income (GNI) is encoded in color.
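The recursive partitioning can be illustrated with the simplest treemap scheme, slice-and-dice, which splits a rectangle into strips and alternates the split direction at each level. This is a deliberately simple stand-in and not necessarily the tiling used to draw Figure 4-17 or the one a layout library would choose; the dictionary format and function names are hypothetical.

```python
def total(node):
    """Sum of the leaf values beneath a node."""
    if "children" in node:
        return sum(total(child) for child in node["children"])
    return node["value"]


def slice_and_dice(node, x, y, w, h, split_width=True):
    """Return (name, x, y, w, h) for every leaf, with area proportional to its value."""
    if "children" not in node:
        return [(node["name"], x, y, w, h)]
    rects, offset, whole = [], 0.0, total(node)
    for child in node["children"]:
        share = total(child) / whole
        if split_width:   # lay the children out side by side
            rects += slice_and_dice(child, x + offset * w, y, share * w, h, False)
        else:             # stack the children on top of each other
            rects += slice_and_dice(child, x, y + offset * h, w, share * h, True)
        offset += share
    return rects


# Made-up data: two "continents", one with two "countries" and one with a single country.
world = {"name": "world", "children": [
    {"name": "P", "children": [{"name": "a", "value": 2}, {"name": "b", "value": 2}]},
    {"name": "Q", "children": [{"name": "c", "value": 4}]},
]}

for rect in slice_and_dice(world, 0, 0, 8, 4):
    print(rect)
```

Slice-and-dice tends to produce long, thin rectangles when a node has many children, which is why more refined tiling schemes are generally preferred in practice.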
47 BigML Blog. Jan 2012. <https://littleml.files.wordpress.com/2012/01/screen-shot-2012-01-23-at-10-00-17am1.png>
48 Wang, Y., Teoh, S.T., Ma, K. “Evaluating the Effectiveness of Tree Visualization Systems for Knowledge
Discovery.” Eurographics/IEEE-VGTC Symposium on Visualization, 2006.
Figure 4-17: Example treemap of country population by continent, from (Veroy 2013)49
4.4.4 Circle Packing
Circle packing is yet another hierarchical layout supported by d3.js in which tree nodes are
represented by shapes. In this visualization, child nodes are recursively packed into parent
circles to fill the area as compactly as possible, again proving very effective for hierarchical data.
Again, there is the option of encoding additional values in shapes’ color and size. Figure 4-18
shows an example of a circle packing layout (the original circle packing tutorial on the D3 page,
in fact), showing the Flare50 class hierarchy. The largest (outermost) circle represents the root
node. Bigger circles encompass all their children, which in turn encompass all their children until
the tree’s leaves are reached (the smallest circles). In this example, the leaves have been colored
orange while all intermediate nodes are shades of blue.
49 Veroy, R. L. Feb 2013. <http://www.eecs.tufts.edu/~rveroy/stuff/GNI2010-treemap.png>
Figure 4-18: Example circle packing layout from Mike Bostock’s website51
50 Flare Data Visualization library, http://flare.prefuse.org/
51 http://bl.ocks.org/mbostock/raw/4063530/
4.5 Data Interaction Techniques
As human interaction with data is a fundamental requirement of visual analytics, this chapter
would not be complete without a discussion of techniques to use in interacting with data. Direct
data manipulation allows a user to interact with data and change some aspect about how it is
displayed. Though multimodal interfaces can allow many ways for users to interact with data
(e.g. speech, gestures, touch), we now present two solely mouse-based interaction techniques,
namely drag and drop and selection. These two fundamental techniques can act as building
blocks for more complex interactions, such as sorting, resizing, toggling, filtering, and brushing,
which will be discussed below.
4.5.1 Drag and Drop
A simple way to move objects around on screen is by clicking, dragging, and releasing the
mouse where the object is intended to land. JQueryUI52 offers an API for easy implementation of
this technique, allowing the creation of draggable elements and drop targets for draggable
elements. Dragging and dropping enables the real-life metaphor of picking up objects and
changing around their locations, making it easy to learn and efficient to use. Drag and drop is
easy to undo, as a user can simply drag an item back to where it started if a move was
unintended.
4.5.1.1 Sorting
Dragging and dropping additionally facilitates the ability to sort data, or reorder items in a list or
grid directly using the mouse. On top of simply changing the display of the interface, sorting can
be linked to a backend to manipulate the state of a database based on the new order of the list or
grid of items.
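On the backend side, a completed drag-and-drop sort reduces to moving one item from its old index to its new one. The sketch below assumes the interface reports those two indices; the function name is hypothetical and is not part of any library API.

```python
def move_item(items, src, dst):
    """Return a new list with the item at index `src` moved to index `dst`."""
    reordered = list(items)  # copy, so the previous order stays available for undo
    reordered.insert(dst, reordered.pop(src))
    return reordered


epochs = ["epoch 1", "epoch 2", "epoch 3", "epoch 4"]
print(move_item(epochs, 0, 2))
```

Returning a new list rather than mutating the old one keeps the previous order around, so an unintended move is easy to reverse.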
4.5.1.2 Resizing
An important property of some interface elements is their size. To enable a user to directly
change the size of an object, JQueryUI also offers a separate API for object resizing. Resizing
easily allows a user to stretch or squeeze an object through drag and drop (supporting easy
undoing of the action). By convention, the affordance for resizability is usually represented by a
grooved border or corner (derived from the real-life metaphor of being able to “grip” a corner),
as seen in Figure 4-19. Thus, implementations with the capability to resize objects should take
this into consideration for easy learnability.
52 www.jqueryui.com
Figure 4-19: An example resizable object, denoted by the grooved “grippable” corner
4.5.2 Selection
Selection is a fundamental capability that enables a user to choose items and shows that those
items have been chosen. Selection can be implemented in many ways, depending on how it will be
used in an interface. The most straightforward selection method is clicking on objects to
select them, and having them change some aspect of their representation (size, color, etc.) to
indicate they have been selected. This can allow selection of either one or multiple items in an
interface (though for easy learnability, there should be some indication of which is supported). A
common implementation of this is through checkboxes (which allow multiple items to be
selected) and radio buttons (which only allow one item to be selected), as illustrated in
Figure 4-20.
Figure 4-20: An example of checkboxes vs. radio buttons, from (Lepofsky 2015)53
Selection can also be implemented by allowing a user to “draw” a box or a lasso over all
elements s/he wishes to select. JQueryUI again offers an API for this method of selectability. The
ability to deselect items (and recognize deselection) is an important feature of safe interfaces.
Once items are selected, they can be highlighted and can remain distinguished from the rest of
the items until deselected.
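Box selection amounts to a containment test against the rectangle the user drew. The sketch below is independent of any UI library; the function name and the sample points are hypothetical.

```python
def box_select(points, corner_a, corner_b):
    """Return the names of all points inside the box spanned by two opposite corners."""
    x0, x1 = sorted((corner_a[0], corner_b[0]))  # the user may drag in any direction
    y0, y1 = sorted((corner_a[1], corner_b[1]))
    return {name for name, (x, y) in points.items()
            if x0 <= x <= x1 and y0 <= y <= y1}


# Hypothetical screen positions of three designs on a scatterplot.
positions = {"design A": (1, 1), "design B": (4, 3), "design C": (9, 9)}

selected = box_select(positions, (5, 5), (0, 0))  # dragged from bottom-right to top-left
print(selected)
selected -= {"design A"}                          # deselecting removes an item again
print(selected)
```

Keeping the selection as a set makes deselection and membership checks straightforward, which helps when other views need to highlight the same items.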
Selection is an instrumental building block for many other common kinds of interaction,
including toggling, filtering, and brushing, described below. To make selection (or any
application of selection) useful, it is generally linked to a backend (as with sorting) that will
respond, passively or actively, based on which items are selected, making it an extremely useful
technique for database manipulation.
53 Lepofsky, A. “In The Next Version” 2015. <http://www.alanlepofsky.net/alepofsky/alanblog.nsf/dx/lotus-notesbasics-checkboxes-and-radio-buttons/content/M2?OpenElement>
4.5.2.1 Toggling
Toggling is the action of switching between (usually two) properties or states based on which is
selected. The switch is most cleanly activated using a marked toggle switch (examples shown in
Figure 4-21), but can also be triggered on selecting or simply clicking an interface element which
serves the same purpose as the toggle switch.
Figure 4-21: An example of toggle switches, from XOO.me design directory54
When designing for error-tolerance, it is important to note that a common type of error is the “mode
error” (also called “state error”), which results when the same actions or displays mean different
things in different states and a user confuses them. The best way to avoid mode errors is to
eliminate modes completely; if that is not possible, the designer can increase the visibility of modes or
design the interface so that no two modes share any actions.
4.5.2.2 Filtering
A very useful type of interaction that selection can aid with is filtering, or screening information
by only displaying what is selected. Filters can be implemented in several different ways, but the
goal is to allow users to easily choose what subset of the data they wish to see displayed. For
data that can be described with any kind of variable (discrete, continuous, categorical,
numerical), filtering could simply involve selecting the variables corresponding to the desired
data, however they are represented (objects, checkboxes, dropdown menus, etc.) by any method
above. Figure 4-22 shows some example filters used on an actual online clothing retail website.
54
XOO.me <http://xoo.me/template/details/11917-6-web-ui-toggle-switches-set-psd>
Sliders, allowing users to set a minimum and maximum variable value as demonstrated in Figure
4-22, are also an efficient filtering tool for primarily continuous numerical variables.
Figure 4-22: Examples of filters for different types of variables: Size, Designer, and Color all allow
discrete selection of respective values (numerical and categorical), and the slider (bottom right) allows
selection for the continuous variable Price (as shown here, the selection allows values from $0-$750). All
taken from an actual retail website, Rent the Runway55
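The two filter styles discussed above (discrete selection and a minimum/maximum slider) can be combined into a single predicate. The sketch below is illustrative only; the item records, field names, and the `makeFilter` helper are assumptions and do not come from any real retail site.

```javascript
// Hypothetical item records; field names and values are illustrative only.
const items = [
  { id: 1, size: 4, color: "red", price: 120 },
  { id: 2, size: 6, color: "blue", price: 480 },
  { id: 3, size: 4, color: "blue", price: 900 },
];

// Build a predicate from checkbox-style sets and slider-style [min, max] ranges.
// An empty or missing entry means "no constraint on this variable".
function makeFilter({ oneOf = {}, range = {} }) {
  return (item) =>
    Object.entries(oneOf).every(
      ([key, allowed]) => allowed.length === 0 || allowed.includes(item[key])
    ) &&
    Object.entries(range).every(
      ([key, [min, max]]) => item[key] >= min && item[key] <= max
    );
}

// Keep blue items priced between $0 and $750.
const visible = items.filter(
  makeFilter({ oneOf: { color: ["blue"] }, range: { price: [0, 750] } })
);
```

Only the item with `id` 2 passes both constraints here; item 3 is blue but falls outside the slider range.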
4.5.2.3 Brushing
Data brushing enables the display of the same selected data in two or more visualizations
simultaneously. The selected data is generally displayed with the same appearance so that it is
easily apparent to users which data points correspond to one another across visualizations. An
example of data brushing across a scatterplot matrix is shown in Figure 4-23. Brushing is an
effective interaction technique to link or coordinate multiple static visualizations.
55
www.renttherunway.com
Figure 4-23: An example of data brushing, taken from Mike Bostock’s website56. The data was selected in
the top-left box, and is colored the same across all scatterplots in the matrix.
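The essence of brushing is that every linked view consults one shared set of selected data points. The following sketch shows that idea without any drawing code; the data points, the `brush` and `colorsFor` helpers, and the color names are all hypothetical.

```javascript
// Hypothetical data points with three numeric attributes.
const points = [
  { id: "p1", x: 1, y: 5, z: 2 },
  { id: "p2", x: 4, y: 1, z: 9 },
  { id: "p3", x: 2, y: 6, z: 3 },
];

// A brush is a rectangle drawn in one view; it yields the ids of the points inside it.
function brush(data, xKey, yKey, [[x0, y0], [x1, y1]]) {
  return new Set(
    data
      .filter((d) => d[xKey] >= x0 && d[xKey] <= x1 && d[yKey] >= y0 && d[yKey] <= y1)
      .map((d) => d.id)
  );
}

// Every linked view colors a point by looking it up in the same shared set.
function colorsFor(data, brushed) {
  return data.map((d) => (brushed.has(d.id) ? "steelblue" : "lightgray"));
}

const brushed = brush(points, "x", "y", [[0, 4], [3, 7]]); // drawn in the x-y view
const colorsInXZView = colorsFor(points, brushed); // reused by the x-z view
```

Because the x-z view never recomputes the selection, the same points are highlighted consistently across every plot in the matrix.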
The techniques mentioned in this chapter are by no means an exhaustive list of visualizations or
interactions, but hopefully provide a starting point to think about how certain datasets may be
represented for exploration and analysis by the visual analytics process. Different levels of Keim,
et al.’s visual analytics mantra may be best represented by multiple interactive visualizations,
either in combination or in sequence, so it is important to keep in mind for the rest of the
discussion in this thesis that often there are multiple techniques that may work for a certain
purpose, rather than one “best” visualization.
Table 4-1 summarizes the visualization techniques presented in this chapter, conveying at a
glance their characteristics, strengths, and the applications for which they are well suited.
56
http://bl.ocks.org/mbostock/4063663
Technique Name      | Capabilities                         | Supported #Dims     | Supported Var Type          | Supported Dataset Size    | Interactions Supported
--------------------|--------------------------------------|---------------------|-----------------------------|---------------------------|-----------------------------
Scatterplot         | Viewing trends, slope                | 2                   | Any                         | Any                       | Sort, Select, Filter
Bubble Chart        | Viewing patterns/trends              | 4-5                 | Any                         | Small-Med                 | Sort, Select, Filter
Scatterplot Matrix  | Linking graphs                       | Multi               | Any                         | Any                       | Sort, Select, Filter, Brush
Line Graph          | Time series representation           | 3-4                 | Any (esp Ordinal)           | Small-Med                 | Sort, Select, Filter
Parallel Coords     | Multidimensional trend/pattern recog | Multi               | Any                         | Any                       | Sort, Filter
Polar Charts/RadViz | Multidimensional trend/pattern recog | Multi               | Any                         | Any                       | Sort, Filter
Force Diagram       | Displays all points in dataset       | N/A                 | Discrete                    | Any                       | Drag, Sort, Select
Sankey Diagram      | Volume of flow between nodes         | Multi               | Discrete (needs flow amt.)  | Any (best w/ Small-Med)   | Drag, Sort, Select, Filter
Pixel-Based Methods | Encodes data with color              | Multi               | Any                         | Any                       | Sort, Filter
Icons/Glyphs        | Encodes data with features           | Multi (to an extent)| Discrete                    | Small-Med                 | Sort, Filter
Hierarchical Axes   | Can view whole dataset in stacks     | Multi (to an extent)| Any                         | Any                       | Sort, Filter
Trees               | Can view whole dataset by path       | Multi               | Discrete                    | Any                       | Sort, Filter
Treemaps            | View whole dataset by compartment    | Multi               | Discrete                    | Any                       | Sort, Filter
Circle Packing      | View whole dataset by compartment    | Multi               | Discrete                    | Any                       | Sort, Filter
Table 4-1: Summary of visualizations presented in this chapter (Sections 4.1-4.4). Includes visualization
names, brief notes about their major strengths/capabilities, the number of dimensions supported (either a
number or number range, multidimensional [meaning 2+], or in the case of Force Diagrams, not
applicable), the variable types supported (Discrete, Continuous, or Any), the dataset size supported
(Small, Med, Large, or Any), and the types of interactions supported from those presented in Section 4.5.
Chapter 5: Functionality and Usability Examination of IEEA Epoch Sampling Submodule
In this chapter, we evaluate five visualizations with respect to the functionality goals for Epoch
Sampling as asserted in Chapter 2. After presenting the best visualization[s] to meet these goals,
we describe an implementation of the submodule and discuss its usability.
5.1 Functionality
Recall from above the functionality goals for the Epoch Sampling submodule, summarized here:
Goal #1: help user understand specific epoch definitions
Goal #2: help user find and select epoch(s)
Goal #3: help user understand:
a) epoch space size,
b) fraction available to explore,
c) fraction already explored
A selection of visualization techniques is now described. For each of these, we evaluate the
strengths and weaknesses of the visualization with respect to the three listed goals. For this
particular submodule, we will also evaluate the visualizations separately for their overall
suitability for two dimensions and greater than two dimensions.
5.1.1 Scatterplots/Bubble Charts
Scatterplots are among the most effective ways to represent two-dimensional data, so if there are only two
epoch variables, a scatterplot may be the best way to visually represent the entire epoch space.
Scatterplots clearly show the combination of two epoch variables defining each epoch (Goal #1), and
based on this, a user can easily locate and select points of interest (Goal #2). All possible epoch
points can be displayed, with different hues/saturation/shading distinguishing the points
that can be explored from those that have already been explored, so the user can get a sense of the whole epoch space (Goal #3).
Enabling dragging over several epochs to select them all would increase selection efficiency as
well. A rudimentary scatterplot interface with 16 epochs is shown
in Figure 5-1.
Figure 5-1: Example of IEEA Epoch Sampling implemented as a scatterplot. The epoch variables were
“Tech Level,” with values “future” or “present,” and “User Preference,” with values 1-8.
If there are more than two variables to plot, however, epochs will not all have unique locations.
Even if more dimensions are encoded by size and color as in a bubble chart, points representing
epochs will still lie on top of each other in x-y space, so users will have no clear way of
separating them spatially to find and select them, or of knowing how deep the epoch space actually goes,
impeding Goal #1 and failing Goals #2 and #3. The same problem arises if additional dimensions
are encoded in a unique icon or glyph. One option is then to display
all such icons in a group rather than plot them on axes, which avoids overlapping locations; for
higher-dimensional epoch datasets, however, the full enumeration of possible icons does not help users
find specific epochs (Goal #2), because shape is not a visual variable that lends itself to
selectivity.
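The overplotting problem can be quantified with a short sketch. It assumes the five NGCS epoch variables and level counts consistent with the 108 enumerated epochs discussed in this chapter (2 x 2 x 3 x 3 x 3); the level labels are placeholders rather than the actual database values, and the function names are hypothetical.

```javascript
// Level counts follow the NGCS example (2 x 2 x 3 x 3 x 3 = 108 epochs);
// the level labels themselves are placeholders, not the values in the database.
const epochVariables = {
  VUAV: ["v1", "v2"],
  SmallBoatSize: ["s1", "s2"],
  EngineEmissions: ["e1", "e2", "e3"],
  RangeIncrease: ["r1", "r2", "r3"],
  IceRegionUse: ["i1", "i2", "i3"],
};

// Enumerate the full factorial of epoch variable levels.
function enumerateEpochs(variables) {
  return Object.entries(variables).reduce(
    (partial, [name, levels]) =>
      partial.flatMap((epoch) => levels.map((level) => ({ ...epoch, [name]: level }))),
    [{}]
  );
}

// Count how many epochs land on each (x, y) position of a scatterplot.
function countPerPosition(epochs, xKey, yKey) {
  const counts = new Map();
  for (const epoch of epochs) {
    const position = `${epoch[xKey]}|${epoch[yKey]}`;
    counts.set(position, (counts.get(position) || 0) + 1);
  }
  return counts;
}

const epochs = enumerateEpochs(epochVariables);
const stacked = countPerPosition(epochs, "VUAV", "SmallBoatSize");
```

With VUAV and SmallBoatSize on the axes, the 108 epochs occupy only 4 distinct positions, so 27 epochs are stacked at every plotted point regardless of how the remaining variables are encoded.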
5.1.2 Parallel Coordinate Plots
Parallel coordinate plots are quite effective at representing and revealing patterns in high-dimensional data when each data point has slightly different values. However, since epochs are
generally generated as a full factorial of epoch variable combinations, many share more than one
variable value, causing this representation to suffer from the same spatial ambiguity problem as
greater-than-two-dimensional scatterplots. In other words, the enumeration of epoch variables
causes each segment between adjacent axes to be shared among many epochs, again hiding
epochs that share the same segment, or location, from the user. An example sketch of this is
shown in Figure 5-2, using the five epoch variables from the Next Generation Combat Ship
(NGCS) database developed by Schofield: VUAV, SmallBoatSize, EngineEmissions,
RangeIncrease, and IceRegionUse (Schofield 2010; Schaffner 2014)57. As an example, two
epochs that share the same values for VUAV and SmallBoatSize will share their first segment,
but users would not be able to tell that the segment encoded more than one entry, skewing
analysis. For this reason, parallel coordinate plots, as with higher-dimension scatterplots, impede
Goal #1 and fail Goals #2 and #3.
Figure 5-2: Example of IEEA Epoch Sampling sketched as a Parallel Coordinate Plot
5.1.3 Trees
For the remainder of this functionality analysis, the visualization techniques presented are
hierarchical and therefore require epoch variable data to be stored in a tree structure to pass into
built-in D3.js layouts. In our tree implementation, we organize the data so that each epoch
variable is at a fixed depth in the tree, and the epochs are the leaves. One such implementation,
using the five epoch variables from the NGCS database, is shown in Figure 5-3. At each node,
the variable name and value are displayed. Clicking a node adds all descendant epochs to the list of
selected epochs (e.g., clicking Epoch #8’s parent node ‘IceRegionUse: high’ would select only
Epoch #8, whereas clicking the root node ‘All Epochs’ would select all 108 enumerated epochs).
The variables’ levels in the tree should be reorderable for easier mass selection (e.g., a user who only
wants to select epochs with IceRegionUse: high can move IceRegionUse to the top of
the tree and expand only that node).
57
Schofield, D.M. “A framework and methodology for enhancing operational requirements development: United
States Coast Guard cutter project case study.” Massachusetts Institute of Technology, 2010.
Evaluating this interface with regard to our epoch sampling goals, it presents a clearly defined
pathway to every epoch, so users should easily be able to tell how epochs are defined and how to
find and select epochs based on epoch variable levels (meeting Goals #1 and #2). If all of the
nodes were expanded, the user would be able to see the size of the full epoch space, and
hue/saturation/shading could further indicate which epochs the user is able to explore and which have already been explored
(conditionally meeting Goal #3).
Figure 5-3: Example of IEEA Epoch Sampling on NGCS data implemented as a Tree
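The tree organization described in this section (one epoch variable per depth, epochs as leaves, selection of all descendants on click, and reorderable levels) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: `buildEpochTree` and `descendantEpochs` are hypothetical names, the level labels are placeholders apart from ‘IceRegionUse: high’, and epoch IDs are assigned in traversal order here, whereas a real implementation would carry the IDs stored in the database.

```javascript
// Placeholder level labels; only "high" for IceRegionUse and the level counts
// (2 x 2 x 3 x 3 x 3 = 108) come from the text.
const levels = {
  VUAV: ["a", "b"],
  SmallBoatSize: ["a", "b"],
  EngineEmissions: ["a", "b", "c"],
  RangeIncrease: ["a", "b", "c"],
  IceRegionUse: ["low", "medium", "high"],
};

// Build a nested { name, children } tree (the shape d3.hierarchy reads by default),
// one level per variable in `order`, with epochs as leaves.
function buildEpochTree(levels, order) {
  let nextEpochId = 1; // assumption: IDs assigned in traversal order
  const grow = (depth) =>
    depth === order.length
      ? [{ name: `Epoch #${nextEpochId}`, epochId: nextEpochId++ }]
      : levels[order[depth]].map((value) => ({
          name: `${order[depth]}: ${value}`,
          children: grow(depth + 1),
        }));
  return { name: "All Epochs", children: grow(0) };
}

// Clicking a node selects every epoch (leaf) underneath it.
function descendantEpochs(node) {
  return node.children ? node.children.flatMap(descendantEpochs) : [node.epochId];
}

const tree = buildEpochTree(levels, Object.keys(levels));
// Reordering: put IceRegionUse at the top so that one click selects all "high" epochs.
const reordered = buildEpochTree(levels, [
  "IceRegionUse", "VUAV", "SmallBoatSize", "EngineEmissions", "RangeIncrease",
]);
const highNode = reordered.children.find((n) => n.name === "IceRegionUse: high");
```

In this sketch, `descendantEpochs(tree)` returns all 108 epochs, while `descendantEpochs(highNode)` returns the 36 epochs that share IceRegionUse: high, which illustrates why reordering makes mass selection a single click.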
5.1.4 Treemaps and Circle Packing
In the case of epoch sampling, since all sibling nodes contain a copy of the exact same
descendants, the treemap can be very repetitive and boring, as seen in Figure 5-4, which again
uses the NGCS epochs. It is hard to distinguish because of the nodes’ shared boundaries, but the
root node (the outermost rectangle) has first been divided into two parts (as the first epoch
variable in the tree, VUAV, has two values), then each of those has been divided into two parts
(the second variable also has two values), etc. down to the leaves of the tree, the epochs. All of
the epochs (the smallest division of rectangles) are easily seen at a glance, but their hierarchy, or
characteristics, are hard to distinguish. Thus, while our third goal can be easily accomplished by
different hue/saturation/shading to compactly display the fraction of all possible epochs selected,
static treemaps do not provide help to accomplish Goals #1 and #2 at all.
Figure 5-4: Treemap visualization of NGCS epochs
As with treemaps, circle packing lends itself to the accomplishment of Goal #3, allowing the
user to get a sense of the whole epoch space (as well as how many variables go into each epoch,
encoded by the number of circle layers), as seen in the left pane of Fig. 5-5. However, recalling
that users may judge quantitative data more accurately from areas bounded by straight lines than from circles’ areas, the treemap
may actually do a better job accomplishing this goal. Our particular implementation of the circle
packing visualization also allows users to zoom to any portion by clicking on the
corresponding circles, as seen in the right pane of Fig. 5-5. Through this, a user can click through
to any particular epoch based on the variable values from higher levels, helping with Goal #2
(though not as effectively as plain trees). Finally, if a
user is zoomed far in to a particular epoch and wants to understand the variables that went into
creating it, he or she can easily zoom out layer by layer to discover them (meeting Goal #1, though a
little less efficiently than plain trees do).
Figure 5-5: Circle packing visualization of NGCS epochs as seen at different zoom levels
While both the treemap and the circle packing visualizations provide the opportunity to view the
entire epoch space at a glance, it is easier to recognize levels in circle packing, because the boundaries
of rectangles overlap, whereas the boundaries of circles do not. Circle packing therefore
accomplishes Goal #3 more effectively than treemaps do in this respect. It should be noted that the ability to
zoom can also be implemented on treemaps, but for the fully enumerated epoch data, as the
rectangles are all still the same size, their shared boundaries will not make this feature as useful
as it is for circle packing.
5.1.5 Evaluation Summary
Table 5-1 below summarizes the relevant features of the proposed implementations from the
discussion above in the context of our IEEA epoch sampling goals. To reiterate, the evaluative
criteria are as follows:
1. Is the visualization good for two epoch variables?
2. Is the visualization good for more than two epoch variables?
3. Does the visualization help the user understand specific epoch definitions? (Goal #1)
4. Does the visualization help the user find and select epochs? (Goal #2)
5. Does the visualization help the user understand a) epoch space size, b) fraction available
to explore, and c) fraction already explored? (Goal #3)
The three possible answers to these questions are:
● “Yes” – This visualization achieves the goal.
● “Fine” – This visualization is mediocre; it neither actively helps nor hinders achieving the
goal.
● “No” – This visualization hinders the achievement of or does not achieve the goal.
Criteria 2-5 are answered assuming there are greater than two epoch dimensions. The best
alternative, as reviewed for epoch sampling, for each row is underlined and highlighted.
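Goal \ Vis. Type                  | Scatterplot | Parallel Coords | Tree | Treemap | Circle Packing
----------------------------------|-------------|-----------------|------|---------|---------------
Good for two dims                 | Yes         | Fine            | Fine | Fine    | Fine
Good for multi dims               | No          | Yes             | Yes  | Yes     | Yes
Goal #1 (understanding)           | Fine        | Fine            | Yes  | No      | Yes
Goal #2 (find & select epochs)    | Fine        | No              | Yes  | No      | Yes
Goal #3 (view epochspace/fracs)   | Fine        | No              | Yes  | Yes     | Yes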
Table 5-1: Summary of characteristics for each visualization (Sec. 5.1.1-5.1.4), with best alternative for
each row underlined. “Fine” indicates that a visualization is passable: not helpful, but not unhelpful.
As seen, for the case of two epoch variables, the scatterplot is the best available option (note that
the scatterplot meets all goals in the two-dimensional case). For multiple dimensions, both the
tree and circle packing visualizations meet all three of our defined goals, making them both
promising alternatives on their own. However, there need not be one “best”
visualization for each purpose: the tree visualization does a better job accomplishing Goals #1
and #2, and the treemap does a better job with Goal #3. The most promising representation
for epoch sampling among these choices therefore seemed to be a coordinated combination of the tree and the treemap,
which draws on the strengths of both visualizations. This hybrid was
the technique we chose to implement for the Epoch Sampling submodule. The following two
sections describe the implementation and evaluate the usability of the resulting
interface.
5.2 Implementation
The Epoch Sampling interface we implemented was built in JavaScript, making use of the
JQuery, JQueryUI, and D3.js libraries. The start state of the interface is shown in Figure 5-6.
Figure 5-6: Start state of Epoch Sampling interface
The box on the top-left side contains instructions for the user to drag epoch variables up and
down to “reorder them in the tree below,” allowing the user the option to change the hierarchy
representation (data tree) in which the epoch data is stored. This box also contains a description
of the box on the top-right: “the fraction of the epoch space selected.” This box is subdivided
into the number of possible epochs (assuming discrete variables), and when the user selects
epochs, the sub-boxes corresponding to the selected epochs are highlighted, as seen in Figure 5-7, so that the user can visualize the fraction of the epoch space he or she has selected.
Figure 5-7: IDs of selected epochs (top) along with current state of top-right box of interface displaying
fraction of epochspace selected (12 epochs out of 108 total; bottom)
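The data behind the top-right box can be sketched as one cell per possible epoch plus a derived fraction. The function name and return shape below are assumptions made for illustration and are not taken from the implemented interface; the choice of which 12 epochs are selected is arbitrary.

```javascript
// Sketch of the data behind the "fraction of the epoch space selected" box.
// The function name and return shape are assumptions, not the thesis code.
function epochSpaceGrid(totalEpochs, selectedIds) {
  const selected = new Set(selectedIds);
  // One cell per possible epoch; cells for selected epochs are flagged for highlighting.
  const cells = Array.from({ length: totalEpochs }, (_, i) => ({
    epochId: i + 1,
    highlighted: selected.has(i + 1),
  }));
  const count = cells.filter((c) => c.highlighted).length;
  return { cells, count, fraction: count / totalEpochs };
}

// 12 of 108 epochs selected, as in Figure 5-7 (which 12 is arbitrary here).
const grid = epochSpaceGrid(108, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]);
```

Here `grid.fraction` is 12/108, or roughly 11% of the epoch space, and `grid.cells` could be bound directly to the sub-boxes that are drawn.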
The box taking up the bottom portion of the interface displays the actual epoch tree. The
implementation of the tree follows the description in Section 5.1.3: each of the nodes is labeled
by an epoch variable and value, and following a path down any branch will lead down to a leaf
that represents an epoch taking all the values of the epoch variables prescribed by its respective
branch. Thus all leaves under a given node will represent epochs that have the value of the epoch
variable specified by that node. An example of a partially expanded tree is shown in Figure 5-8.
Figure 5-8: Partially expanded tree (using NGCS database epoch variables/values) in implemented
interface
In this implementation, clicking nodes toggles the visibility of their children/descendants. Filled-in
blue circles signify that the node contains hidden children, and white circles signify that the
node has been expanded (or is a leaf and has no children). To select epochs, the user must click
into ‘SELECT mode,’ in which clicking nodes adds all descendant epochs to the list of selected
epochs, and the variables’ levels in the tree can be reordered using the drag-and-drop
functionality in the top-left box for easier mass selection, as prescribed in Section 5.1.3. Select
mode is distinguished from the default mode by the background color of the bottom box; the default
mode gives the box a gray background as seen in Figures 5-6 and 5-8, whereas the select mode
gives the box a dark red background as seen in Figure 5-9.
Figure 5-9: The bottom box of the Epoch Sampling interface in “SELECT mode”
Finally, the “Reset Selections” button appears in both the default and select modes, and offers
the ability to get rid of all previous epoch selections.
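The two-mode behavior described above amounts to a small state machine in which the same click is interpreted differently depending on the current mode. The sketch below shows that logic without any DOM or D3.js code; all names, the node shape, and the example node label are hypothetical.

```javascript
// Sketch of the two-mode click handling described above; all names are hypothetical.
function createEpochTreeState() {
  const state = { mode: "default", expanded: new Set(), selected: new Set() };
  return {
    state,
    // The mode also drives the background color, which keeps the mode visible.
    background: () => (state.mode === "select" ? "darkred" : "gray"),
    switchMode() {
      state.mode = state.mode === "default" ? "select" : "default";
    },
    // node: { id, epochIds }, where epochIds lists all descendant epochs.
    clickNode(node) {
      if (state.mode === "select") {
        node.epochIds.forEach((id) => state.selected.add(id));
      } else if (state.expanded.has(node.id)) {
        state.expanded.delete(node.id); // collapse
      } else {
        state.expanded.add(node.id); // expand
      }
    },
    resetSelections() {
      state.selected.clear();
    },
  };
}

const ui = createEpochTreeState();
const node = { id: "example-node", epochIds: [1, 2, 3] }; // illustrative node
ui.clickNode(node); // default mode: expands the node
ui.switchMode();
ui.clickNode(node); // select mode: selects epochs 1-3
```

Keeping the mode, the expansion state, and the selection in one object makes it straightforward to derive every visible cue (background color, circle fill, highlighted sub-boxes) from the same source.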
5.3 Usability
In evaluating the usability of the implementation described above, we will focus on strengths and
weaknesses with regard to the metrics described in Section 3.2 – Learnability, Efficiency, and
Error-Tolerance – paying special attention to the italicized concepts in each section as discussed
above.
5.3.1 Learnability
The interface’s text immediately gives the user instructions on what the purpose of each of the
three sections of the interface is and how to manipulate them (drag and drop, click on nodes,
etc.), helping with immediate learnability. First-time users will most likely have to learn by
doing (or playing around with the interface) rather than by watching someone else manipulate it,
but as Interactive Epoch-Era Analysis is intended to be a tool for exploration, it is safe to assume
users are comfortable with exploring the interface’s functionality. It is also fairly safe to assume
that users will be goal-oriented (with the goal of selecting epochs for further IEEA purposes),
and this directs their learning to perform the selection task, rather than exploring the interface
aimlessly.
An important learnability feature of the drag-and-drop epoch variables is the allowance for
recognition of variable names rather than recall. This type of manipulation allows the user to
focus on the cognitive task of ordering epoch variables in the hierarchy rather than on conveying
knowledge to the interface and ensuring the task is carried out properly. The tree visualization
automatically updates and resets every time the variable order is changed, offering the user a
quick and visible response to his or her reordering action.
The interface is consistent with other web applications in that all clickable items (buttons, drag-and-drop panels, tree nodes) offer the affordance of clickability by turning the cursor into a
pointer. The interface’s color scheme is also consistent with SEAri’s, offering high-contrast
differences between parts and distinguishable states.
5.3.2 Efficiency
As this particular interface is geared more to provide comprehensive learnability for new users,
the efficiency for returning users is slightly compromised. Returning users have the same main
goal as new users: select epochs for further analysis. The action of selecting an epoch (or group
of epochs) through this interface requires switching modes, or an extra click. If the user later
wishes to expand another node, another click is required to switch back to the default mode
where clicking nodes expands them rather than selects. While this extra click is slightly
inefficient and can add up over time and interface usage, a returning user may also develop a
strategy to minimize the number of extra clicks: click all nodes he may wish to expand before
switching to select mode, then the only clicks necessary will be to select the epochs. This
strategy encourages all epoch variable exploration to be done in one chunk before switching
modes, separating exploration from the final selection in a user’s memory.
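The cost of mode switching, and the benefit of the batching strategy, can be made concrete with a small click count. The sketch assumes that the interface starts in the default (expand) mode and that every mode switch costs exactly one click; `countClicks` is a hypothetical helper.

```javascript
// Count clicks for a sequence of "expand" and "select" actions, assuming the
// interface starts in default (expand) mode and each mode switch costs one click.
function countClicks(actions) {
  let mode = "expand";
  let clicks = 0;
  for (const action of actions) {
    if (action !== mode) {
      clicks += 1; // extra click on the mode switch
      mode = action;
    }
    clicks += 1; // the click on the node itself
  }
  return clicks;
}

const interleaved = countClicks(["expand", "select", "expand", "select", "expand", "select"]);
const batched = countClicks(["expand", "expand", "expand", "select", "select", "select"]);
```

For the same six node clicks, the interleaved order requires 11 clicks in total, whereas doing all expansion first requires 7, since only one mode switch is needed.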
Common targets (switch modes button, epoch variables, nodes) all have a maximized area on
this interface to minimize pointing time. As the cursor switches to a pointer on hovering over a
clickable item, the user immediately knows when he is able to click. For the nodes, the clickable
area includes the text labeling the epoch variable and value, which is additionally bolded on
hover over to slightly increase target area.
The drag-and-drop feature helps with steering as it does not require the user to place the panel in
the exact spot in which he wishes to drop it; the ordered list shows a blank panel-sized
placeholder at the current drop target, and on release, the panel snaps to the
placeholder position no matter where the user’s cursor is. This allows the user to overshoot the
top or bottom of the list, so he does not have to aim for these positions, and the panel will snap to
the closest position in the list to the cursor.
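The snapping behavior can be illustrated with simple geometry: the drop position is the list slot whose center is nearest to the cursor, clamped to the first and last slots. This is only an illustrative sketch with hypothetical parameter names; it is not the algorithm used internally by JQueryUI.

```javascript
// Illustrative geometry only; this is not the library's internal algorithm.
// Returns the list position closest to the cursor, clamped to the ends of the list.
function dropIndex(cursorY, listTop, panelHeight, panelCount) {
  const nearest = Math.round((cursorY - listTop - panelHeight / 2) / panelHeight);
  return Math.min(panelCount - 1, Math.max(0, nearest));
}

// A list of 5 panels, each 40 px tall, starting at y = 100.
const aboveList = dropIndex(20, 100, 40, 5); // cursor overshoots the top
const belowList = dropIndex(500, 100, 40, 5); // cursor overshoots the bottom
const inside = dropIndex(185, 100, 40, 5); // cursor between slots 1 and 2
```

Overshooting the top yields slot 0, overshooting the bottom yields slot 4, and a cursor at y = 185 snaps to slot 2, whose center (y = 200) is the nearest.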
The drag-and-drop capability also helps with efficiency in terms of a slight shortcut. Previous
prototypes of this interface included drop-down menus for each level of the tree, requiring the
user to click and select an epoch variable for each level (in addition, making sure not to assign
the same variable to two different levels). This capability cleanly removes these inefficient extra
clicks (as well as cognitive burden on the user to keep track of which variables have already been
assigned).
5.3.3 Error-Tolerance
The most common error in this interface would most likely result from the fact that clicking
nodes does two different things in the two different states: in the default mode, it expands the
node; in select mode, it selects all descendant epochs. The interface aims to mitigate this by the
high-contrast deep red (a variant of a color used for warnings) background in select mode, but as
noted above, the fairly common problem of human inattention, and therefore mode error,
remains.
Besides preventing errors, a key characteristic of an interface’s error-tolerance is its ability to
help a user recover when errors do occur. Clicking a node while wishing to select in the default
mode is fairly harmless: the price is just one extra click, and a user should be able to immediately
realize the mistake (due to the interface immediately expanding the node, or its responsiveness)
and easily switch to select mode to perform the task correctly. However, clicking a node while
wishing to expand in select mode is a little more problematic, as it will add epoch IDs to the list
of selected epochs and to the shaded fraction of the epoch space, though the user might not have
wished to select them. The lack of node expansion is also immediately noticeable to the user, so
the user can choose to rectify this mistake by clicking the available “Reset Selections” button.
This button currently removes all selected epochs, introducing inefficiency if the user has to
reselect all correct previous selections, so this feature could be enhanced with the ability to
individually undo any selection or group of selections.
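One possible shape for the suggested enhancement is to record each selecting click as its own group, so that a mistaken click can be undone without discarding earlier selections. This sketch is a design suggestion under that assumption, not a description of the implemented interface, and all names are hypothetical.

```javascript
// Sketch of the suggested enhancement: each click in select mode is recorded as one
// group, so a mistaken click can be undone without resetting everything.
function createSelectionHistory() {
  const groups = []; // one array of epoch ids per selecting click
  return {
    select(epochIds) {
      groups.push([...epochIds]);
    },
    undoLast() {
      return groups.pop() || [];
    },
    resetAll() {
      groups.length = 0;
    },
    // An epoch stays selected as long as any remaining group contains it.
    selected: () => [...new Set(groups.flat())].sort((a, b) => a - b),
  };
}

const selectionHistory = createSelectionHistory();
selectionHistory.select([1, 2, 3]); // intended selection
selectionHistory.select([3, 4, 5]); // mistaken click in select mode
selectionHistory.undoLast(); // removes only the mistaken group
```

After the undo, epochs 1 through 3 remain selected; epoch 3 is kept because the earlier, intended group still contains it.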
The only other main opportunity for human error is the drag-and-drop placeholder falling
somewhere other than where the user intended for it to drop. This is a very easy fix, however,
only costing one extra mousedown and mouseup, as the user can repeat the action to drop the
panel in the correct place in the list without damaging the tree expansion (any drag-and-drop
action on the variable panels will shrink the tree back to the start state of one level either way) or
already selected epochs.
Chapter 6: Functionality Examination for Other IEEA Submodules
In this chapter, the same kind of functionality analysis as was performed in Section 5.1 will be
performed on the remaining four of the aforementioned submodules: Era Sampling, Multi-Epoch
Analysis, Single-Era Analysis, and Multi-Era Analysis.
6.1 Era Sampling
Recall from above the functionality goals for the computational and narrative-based Era
Sampling submodule, summarized here:
Automatic generation:
Goal #1: help user understand specific era definitions
Goal #2: help user find and select era(s)
Goal #3: help user understand:
a) fraction of era space available to explore,
b) fraction already explored
Manual generation:
Goal #1: help user create era by choosing constituent epochs
Goal #2: help user identify eras at a glance
Goal #3: help user understand:
a) fraction of era space available to explore,
b) fraction already explored
A selection of visualization techniques is now described for each of the era generation methods.
For each visualization presented, we evaluate its strengths and weaknesses with respect to all of
the listed goals above.
6.1.1 Parallel Coordinates
As the goals for the Automatic Era Generation Sampling method are very similar to those of
Epoch Sampling, there is some overlap in visualizations presented; however, the meanings of
the structures can be interpreted very differently for epoch and era representation. The parallel
coordinate axes in this context can represent either different points in time (though convention
warns against using the x-axis as time) or different epochs that constitute an era. For the former,
the eras themselves would be represented as the lines between the axes. The distance between the
axes would be the minimum amount of time between epoch shifts for any of the eras being
displayed, and the lines would plot the state of each era at each specified point in time, bringing
out the trajectory of epochs over time. For the latter, there would be as many axes as there are
unique epochs, and the lines between the axes bring out transitions between epochs (and the
times at which they occur). Both of these representations, however, suffer from the same spatial
ambiguity problem that the parallel coordinates representation of epochs faced in that multiple
eras could share the same line segment, especially as automatic generation assumes all of the
possible combinations of epochs were enumerated into these eras, so line segments show more
redundancy than information. With this representation it is difficult to tell the volume of eras at a
glance, impeding all three of the automatic generation goals (and not even addressing the manual
generation goals), making parallel coordinates a poor choice with which to represent eras.
6.1.2 Sankey Diagrams
Sankey diagrams, like parallel coordinates, offer the ability to show transitions between different
epochs, but have the added benefit of showing volume, alleviating some of the issues with spatial
ambiguity. With time implied on the x-axis, Sankey diagrams make it possible to get a sense of
the size of the era space and distribution of epochs and transitions, aiding with Automatic
Generation Goal #3. They illustrate the idea that eras are characterized by their constituent
epochs and transitions between them, helping to guide the user with Goal #1. Figure 6-1 shows
an example Sankey diagram representation of a set of eras. Through labeling and semitransparent flows, it is clear how many total eras there are, and how many are at which epoch
during what time. Though mass transitions are easy to spot at a glance, it is difficult to pick out
individual eras, impeding Goal #2.
Figure 6-1: Automatically enumerated era set represented as a Sankey diagram of epoch flows.
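The flow volumes in such a diagram can be derived by counting, for each pair of adjacent time steps, how many eras make each epoch-to-epoch transition. The sketch below assumes a hypothetical era set in which each era lists the epoch occupied at successive, equally spaced time steps; the `sankeyLinks` helper and the node naming scheme are illustrative.

```javascript
// Hypothetical era set: each era lists the epoch occupied at successive time steps.
const eras = [
  ["E1", "E2", "E3"],
  ["E1", "E2", "E2"],
  ["E1", "E3", "E3"],
];

// Aggregate eras into Sankey links. Nodes are (time step, epoch) pairs, and a link's
// value is the number of eras that make that transition.
function sankeyLinks(eras) {
  const volume = new Map();
  for (const era of eras) {
    for (let t = 0; t + 1 < era.length; t++) {
      const key = `t${t}:${era[t]}->t${t + 1}:${era[t + 1]}`;
      volume.set(key, (volume.get(key) || 0) + 1);
    }
  }
  return [...volume].map(([key, value]) => {
    const [source, target] = key.split("->");
    return { source, target, value };
  });
}

const links = sankeyLinks(eras);
```

The aggregation also shows why individual eras are hard to pick out: once eras are merged into link volumes, the identity of each era is no longer present in the data being drawn.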
6.1.3 Tree Structures
The sequential nature of era data appears to make it a good candidate for hierarchical
visualizations. In such an implementation, data could be organized such that all of the possible
starting epochs are children of a common root node, and a whole branch from the root down to
the leaf would represent one era, similar to an individual complete left-to-right path in a Sankey
diagram. Levels of the tree could be spaced by the minimum time between epoch shifts. By
itself, a tree representation of an era set would present a clearly defined pathway to every era, so
users can easily tell how individual eras were formed, aiding with Automatic Generation Goals
#1 and #2. An implementation utilizing expandable nodes, such as in Figure 5-3, could help
show the complete era space and eras the user is able to and already has explored (Goal #3),
though the volume of eras is again not immediately obvious because of spatial ambiguity.
Drawing from the success of the combined hierarchical Epoch Sampling visualization, an
analogous interface for Era Sampling could overlay a tree visualization with circle packing or
even a Sankey diagram to help solidify Goal #3.
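The tree organization of eras described above corresponds to a prefix tree: eras that begin with the same epochs share the upper part of a branch. The following sketch assumes a hypothetical era set in which all eras have the same number of epochs (otherwise an era that is a prefix of another would not end at a leaf); the function names are illustrative.

```javascript
// Hypothetical eras, written as ordered lists of epoch labels of equal length.
const eras = [
  ["E1", "E2", "E3"],
  ["E1", "E2", "E2"],
  ["E2", "E1", "E3"],
];

// Insert each era as a root-to-leaf path; eras that start the same way share a prefix.
function buildEraTree(eras) {
  const root = { name: "All Eras", children: [] };
  for (const era of eras) {
    let node = root;
    for (const epoch of era) {
      let child = node.children.find((c) => c.name === epoch);
      if (!child) {
        child = { name: epoch, children: [] };
        node.children.push(child);
      }
      node = child;
    }
  }
  return root;
}

// Each leaf corresponds to one complete era.
const countEras = (node) =>
  node.children.length === 0
    ? 1
    : node.children.reduce((n, c) => n + countEras(c), 0);

const eraTree = buildEraTree(eras);
```

The possible starting epochs (E1 and E2 here) become the children of the root, and counting leaves recovers the number of eras, which is the volume information that is not obvious from the tree drawing alone.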
6.1.4 Bar Chart Icons
The last three visualization techniques have focused more on the Automatic Generation goals,
but the next two will now aim to focus more on Manual Generation. The most straightforward
way for a new user to think about forming an era is through the story it tells about the changing
contexts and/or needs related to a system. Once the user understands that each time one of these
contexts or needs changes signifies the start of a new epoch, it is clear that an era simply consists
of an ordered string of epochs, each with its own duration. In visualization terms, a single era can
be represented by ordered objects, each with their own value. Position along a scale is the most
distinguishable visual variable on Cleveland et al.’s list, so by ordering the objects along a single
scale, taking up lengths proportional to their durations, a single era can be represented by a sort
of bar graph. Different epochs or types of epochs can additionally be distinguished by color or
hue, as seen in Figure 6-2 (while taking care not to overuse visual variables as always). This
representation tells the user what epochs and durations make up an era at a glance, helping fulfill
Manual Generation Goal #2. An implementation that lets users set the epochs and their
lengths would then readily accomplish Goal #1 as well.
Figure 6-2: A single era represented as a series of epochs along a single axis
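The layout of such an icon reduces to a small calculation: each epoch receives a segment whose length is proportional to its duration. The following is a minimal sketch under stated assumptions; the `EpochSpan` type, the `bar_segments` function, the icon width of 100 units, and the example durations are all illustrative rather than taken from an existing implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EpochSpan:
    epoch_id: str
    duration: float  # any consistent time unit, e.g. years


def bar_segments(era: List[EpochSpan], icon_width: float = 100.0) -> List[Tuple[str, float, float]]:
    """Return (epoch_id, offset, width) for each epoch, in era order."""
    total = sum(span.duration for span in era)
    segments = []
    offset = 0.0
    for span in era:
        width = icon_width * span.duration / total  # length proportional to duration
        segments.append((span.epoch_id, offset, width))
        offset += width
    return segments


# Hypothetical four-epoch era lasting ten years in total.
era = [EpochSpan("A", 3), EpochSpan("B", 3), EpochSpan("C", 2), EpochSpan("D", 2)]
segments = bar_segments(era)
```

Each tuple could then be drawn as one colored rectangle along the single axis, with color or hue assigned per epoch type.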
Figure 6-2 simply represents one era, but if a user has to pick and distinguish between multiple
eras, the designer is faced with the task of arranging the eras in a spatial display. Earlier we
repeatedly saw that plotting on a 2D graph was prone to problems stemming from multiple
points sharing the same characteristics that determined location on the graph. Thus a display with
no such arrangement is shown in Figure 6-3, where information about an era’s “type” or
constituent epochs can be encoded in orientation and color/hue. Additional visual variables such
as shape (e.g. rounded edges, border line thickness) can encode more information about the era
depending on how much the user feels the need to distinguish between eras. Especially with
manual generation, letting users attach their own mnemonics to epochs/eras through these additional variables
can help them recognize the eras later on (Goal #2). This kind of
representation is also helpful in accomplishing Goals #1-2 for automatic generation,
though the number of eras to display is almost certainly greater than it is in manual generation,
so human cognitive processing power may limit this technique’s effectiveness. A bar chart icon
represents only a single era, so without further encoding it is difficult for the user to understand the era space and the
fractions of it that are available to explore and that have already been explored (Automatic and Manual Goal #3).
For example, as mentioned above, a different hue or shape could
mark the eras already explored; if a handful of manually constructed eras were arranged
as in Figure 6-3, the proportion of eras already explored would then be fairly evident at a glance,
helping to achieve Goal #3.
Figure 6-3: An unorganized set of 7 eras (represented by bar chart icons) with hue, color, and orientation
as additional encoding.
6.1.5 Drag and Drop
As introduced in Section 4.5.1, one very useful interaction technique is drag-and-drop, which
allows users to easily move objects onto specified locations. This technique lends itself to
manual era generation: the user simply drags epochs (however they are represented), in order, onto a panel that
represents a timeline. As jQuery also supports element resizing, once epoch elements are
displayed on the timeline, they can be stretched or squeezed to represent their respective
durations as in Figure 6-2, directly achieving Manual Generation Goal #1. An example
schematic of this idea is shown in Figure 6-4, where epochs are represented as circles with epoch
variable values explicitly stated and encoded in color. Upon release over the era grid, the epoch
circle turns into a resizable rectangle as part of the bar icon. Ideally the user would also be able
to switch around or remove these rectangles, hover to view full epoch information, and change
visual variables (e.g. hue, color, shape, orientation) of individual epochs or the entire era after
completing creation.
Figure 6-4: Sketch of leveraging drag-and-drop and resizing functionalities for manual era creation
6.1.6 Evaluation Summary
Recall the four methods of era creation proposed by Schaffner (2014): Human-in-the-loop,
Breadth-First Search through clips, Sampling, and a Combination of these. This section has
attempted to address potential interfaces for creating eras (not the methods themselves!) with
human-in-the-loop (manual generation) as well as how to manually select eras after they have
been automatically created through any of the other methods. Table 6-1 below summarizes the
relevant features of the proposed implementations from the discussion above in the context of
our IEEA era sampling goals. To reiterate, the evaluative criteria are as follows:
1. Does the visualization help the user understand specific era definitions? (Automatic Goal
#1)
2. Does the visualization help the user find and select eras? (Automatic Goal #2)
3. Does the visualization help the user understand a) fraction available to explore, and b)
fraction already explored? (Automatic Goal #3)
4. Does the visualization help the user create an era by choosing constituent epochs and
their durations? (Manual Goal #1)
5. Does the visualization help the user identify eras he previously constructed at a glance?
(Manual Goal #2)
6. Does the visualization help the user understand a) fraction available to explore, and b)
fraction already explored? (Manual Goal #3)
The four possible answers to these questions are:
● “Yes” – This visualization achieves the goal.
● “Fine” – This visualization is mediocre; it neither actively helps nor hurts in achieving the
goal.
● “No” – This visualization hinders the achievement of or does not achieve the goal.
● “N/A” – This visualization only addressed the automatic generation goals, so is not
applicable to evaluate on manual generation goals.
The best alternative, as reviewed for era sampling, for each row is underlined and highlighted.
Goal \ Vis. Type | Parallel Coords | Sankey Diagram | Tree | Bar Icons
AG #1 | No | Yes | Yes | Yes
AG #2 | No | No | Yes | Yes
AG #3 | No | Yes | Fine | Fine
MG #1 | N/A | N/A | N/A | Yes
MG #2 | N/A | N/A | N/A | Yes
MG #3 | N/A | N/A | N/A | Fine
Table 6-1: Summary of characteristics for each visualization (Sec. 6.1.1-6.1.4), with best alternative for
each row underlined. “Fine” indicates that a visualization is passable: not helpful, but not unhelpful.
Bar icons outperform the other visualizations in being more generally useful for achieving both manual
and automatic generation goals; however, as the number of eras from which to select increases,
the effectiveness of the bar icon representation decreases. The size and explorable fraction of the
era space are also hard to glean from the bar icons alone. It seems that for manual
generation, or a small number of eras, bar icons are the best option for representing eras, but for
the general automatic generation case, or a large number of eras, the overlaid Sankey/tree
diagram seems to be the best available option of the techniques discussed. Additionally, as the
drag-and-drop and resizing techniques do work very nicely to create the bar icon representation,
this functionality can serve as the primary tool to manipulate epochs (however they are
represented) to perform manual era creation. As bar icons seem to be the optimal representation
for manual era construction from the techniques presented, the “best” alternative for MG #3 is
given to bar icons with the stipulation that there must be some further visual encoding to display
which eras have data and which have already been explored. It may even be in the user’s interest
to draw from the Epoch Sampling submodule and borrow the shaded-area-visualization
capabilities of a treemap to help clarify the era space fractions.
6.2 Multi-Epoch Analysis
We now start examining submodules within the Analysis module. For the remainder of this
chapter, the discussions of visualizations will incorporate considerations for interactive
techniques that could augment the corresponding visualization. It is important to
remember as the processes in the submodules get more complex that there need not be a single
visualization that accomplishes all of the respective submodule goals (i.e. a coordinated view of
linked visualizations may be better suited to meeting goals). In his thesis, Schaffner actually
recommends a “widely-varying visualization approach… since so many aspects of the data can
be presented and studied” (Schaffner, 146)58. With this in mind, visualizations and corresponding
interaction techniques will be presented with the intention of offering specific capabilities to the
analysis as a whole.
Recall from above the functionality goals for the Multi-Epoch Analysis submodule, summarized
here:
Goal #1: help user compare designs across all epochs simultaneously
Goal #2: enforce that epochs have no order
Goal #3: allow user to explore further after performing computational analysis
A selection of visualization techniques is now described for this analysis method, focusing on
manual user interaction rather than specifying cost/utility models or performing any sort of
computation. For each visualization presented, we evaluate its strengths and weaknesses with
respect to the goals listed above.
6.2.1 Scatterplot Variants
Now that the points being plotted are designs, rather than enumerated epochs or eras, spatial
ambiguity becomes less of a concern (i.e. two designs with the exact same utility and cost value
are far less common than epochs that share two context variable values), so scatterplots become a
convenient and useful visualization again. To analyze designs in a single epoch, a simple
scatterplot (or bubble chart) can easily display 2 (or 4) attributes for each design. Choosing
attributes can be accomplished in a variety of ways, such as selecting from a dropdown or
58
Schaffner, M.A., Designing Systems for Many Possible Futures: The RSC-based Method for Affordable Concept
Selection (RMACS), with Multi-Era Analysis, Master of Science Thesis, Aeronautics and Astronautics, MIT, June
2014. p. 146
coordinating the plot with another visualization tool. Rhodes and Ross (2015)59 demonstrated
the latter by pairing an up-to-4D bubble chart with a parallel coordinate plot, shown in Figure 6-5,
which has the added benefit of bringing out patterns in design attributes if they exist.
Figure 6-5: Bubble chart paired with parallel coordinates that allow user to choose which attributes to
plot (taken from Rhodes and Ross, 2015)
One way of modifying this visualization for many epochs is to include options for metrics that
encompass design performance across all epochs (e.g. percentage of epochs in which design FPN
< 10%). Another straightforward way of extending this visualization to show the same designs in
many different epochs is to replace the bubble chart with a bubble chart matrix, with each plot
showing designs in a different epoch, keeping the x/y/size/color axis mappings the same for all
epochs. Highlighting a design point in all epochs (as well as its trajectory in the parallel
coordinate plot) upon hover would help achieve Goal #1. If, after computing the most
fuzzy Pareto optimal designs, a user wishes to confirm the results through exploration, this
visualization would allow the user to narrow down the point by design variables and eyeball it
59
Rhodes D.H. and Ross A.M., Interactive Model-Centric Systems Engineering (IMCSE) Phase Two Technical
Report SERC-2015-TR-048-2; February 2015.
across all of the plots to ensure it is close enough to the Pareto front in each one, helping to
achieve Goal #3. Because grids are associated with having some order, human cognitive biases
might impede the success of Goal #2, but allowing the user to switch around the order of epochs
(perhaps through drag-and-drop or with a built-in ‘sort’ functionality) may alleviate this issue.
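A cross-epoch metric of the kind mentioned above (the percentage of epochs in which a design's FPN falls below a threshold) can be computed with very little code. The sketch below is illustrative only: the function name `fraction_of_epochs_within`, the design labels, and all FPN values are hypothetical, and FPN is assumed to be expressed in percent.

```python
from typing import Dict, List


def fraction_of_epochs_within(fpn_by_epoch: List[float], threshold: float) -> float:
    """Share of epochs in which a design's FPN is strictly below the threshold."""
    if not fpn_by_epoch:
        raise ValueError("need at least one epoch")
    hits = sum(1 for fpn in fpn_by_epoch if fpn < threshold)
    return hits / len(fpn_by_epoch)


# Hypothetical FPN values (percent) for three designs over four epochs.
fpn: Dict[str, List[float]] = {
    "design_12": [0.0, 5.0, 12.0, 8.0],
    "design_14": [2.0, 3.0, 4.0, 9.0],
    "design_128": [15.0, 20.0, 0.0, 11.0],
}

# One summary value per design, suitable for an axis, size, or color mapping.
summary = {design: fraction_of_epochs_within(values, 10.0) for design, values in fpn.items()}
```

Because the result is a single number per design, it could be offered as one more selectable attribute in the bubble chart without changing the chart itself.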
6.2.2 Epochs as Parallel Coordinates
As straightforward as scatterplot matrices are, rendering so many plots in a grid may slow down
response time and increase the cognitive burden of sorting through all of the displayed
information. There is still another use for parallel coordinates in Multi-Epoch Analysis, however,
by assigning epochs to the parallel axes. Each design is then represented as a line across
the axes, intersecting each axis at the value the user-selected attribute takes in that epoch. A
simple sketch of this concept is shown in Figure 6-6, where the user is analyzing 7 designs across
6 epochs. The intersections show the design’s Fuzzy Pareto Number during each of the epochs,
providing an easy way to eyeball the success metric and compare design performance across
epochs, helping Goal #1. This visualization also provides an easy way to achieve Goal #3. For
example, if the second-to-bottom design in Figure 6-6 is computationally recommended as
“best” because it has the lowest total FPN, the user can then explore and notice that the bottommost
design actually has a lower FPN in all epochs except for Epoch 5, which may be an epoch
the user is not too concerned about in the first place; the user is thus able to revise and confirm the
decision about which design is optimal. However, because parallel coordinates appear to
read left-to-right, the success of Goal #2 is still impeded, though this may again be alleviated if
interactive functionality is added to enable the user to switch around the order of the epoch axes.
As parallel coordinates help bring out patterns, this visualization could additionally help the user
spot particularly good or bad epochs depending on the metric used on the y-axis.
Figure 6-6: Parallel coordinate plot showing Fuzzy Pareto Number of 7 designs (horizontal lines) being
plotted over 6 epochs (vertical axes)
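The scenario described above, in which the design with the lowest total FPN is not the better design in most individual epochs, can be reproduced with made-up numbers. This is a hedged sketch: the FPN values are invented for illustration and do not come from Figure 6-6, and the helper `epochs_where_better` is hypothetical.

```python
from typing import List


def epochs_where_better(a: List[float], b: List[float]) -> List[int]:
    """1-based epoch numbers in which design a has a strictly lower FPN than b."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x < y]


# Hypothetical FPN values for two designs across six epochs.
recommended = [4.0, 4.0, 4.0, 4.0, 4.0, 4.0]    # lowest total FPN
alternative = [2.0, 3.0, 1.0, 3.0, 30.0, 2.0]   # poor only in Epoch 5

total_recommended = sum(recommended)
total_alternative = sum(alternative)
better_epochs = epochs_where_better(alternative, recommended)
```

With these values the aggregate favors the recommended design, while the per-epoch comparison shows the alternative ahead in every epoch except Epoch 5, which is the kind of discrepancy the plot lets a user spot visually.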
6.2.3 Circular Extensions
Both polar charts and RadViz, introduced earlier as circular extensions to parallel coordinates,
would allow users to explore designs further after computational analysis (Goal #3), and the
polar chart allows users to compare designs over all epochs in a more compact manner than
parallel coordinates (Goal #1). The wrap-around to circular axes is a useful transformation if
there are a lot of epochs being analyzed, to lessen the cognitive burden of processing all the
information. Some argue that circles convey a sense of a quantity being unordered (e.g. the color
wheel), but the representation of data points, especially in RadViz, depends on which axes are next
to one another. Increasing interactivity and enabling the user to readily switch around axes
(assuming they still represent different epochs) in these two plots would not only help enforce
Goal #2, but also potentially reveal relationships among the epochs being
analyzed.
6.2.4 Evaluation Summary
Table 6-2 below summarizes the relevant features of the proposed implementations from the
discussion above in the context of our IEEA Multi-Epoch Analysis goals. To reiterate, the
evaluative criteria are as follows:
1. Does the visualization help the user simultaneously compare designs in selected epochs
(and evaluate designs by the user’s choice of metric)?
2. Does the visualization help the user understand that epochs being analyzed are not
necessarily sequential?
3. Does the visualization allow the user to explore and confirm trust in computationally
recommended optimal designs post-calculation?
The three possible answers to these questions are:
● “Yes” – This visualization achieves the goal.
● “Fine” – This visualization is mediocre; it neither actively helps nor hurts in achieving the
goal.
● “No” – This visualization hinders the achievement of or does not achieve the goal.
The best alternative, as reviewed for multi-epoch analysis, for each row is underlined and
highlighted.
Goal \ Vis. Type | Scatterplot Matrix | Parallel [Epoch] Coords | Polar Chart | RadViz
Compare designs across epochs? (Goal #1) | Yes | Yes | Yes | No
Understand epochs are non-ordered? (Goal #2) | Fine | Fine | Fine | Fine
Post-calculation exploration? (Goal #3) | Yes | Yes | Yes | Yes
Table 6-2: Summary of characteristics for each visualization (Sec. 6.2.1-6.2.3).
Though a single best alternative was selected for Goals #1 and #3, the margin of difference
among the non-“No” alternatives is not large. Because the scatterplot matrix is
able to show the most information, it allows somewhat more comprehensive exploration.
In the other proposed visualizations, only one value per design can be
plotted (e.g. FPN), whereas the scatterplot (or bubble chart) matrix allows for up to four. Though
this can be an advantage for depth of analysis, the matrix is often much more computationally expensive
than the other alternatives, so those alternatives might be preferred for quick system response or for large
numbers of designs or epochs.
A best alternative was not picked for Goal #2 because none of the proposed visualizations really
enforced the fact that epochs being analyzed are not sequential. However, as stated at the end of
Sections 6.2.1-6.2.3, all were deemed fine if interactivity was enabled and encouraged to let the
user readily switch around the order of epochs as they were represented, through a method such
as drag and drop or sorting.
6.3 Single-Era Analysis
Recall from above the functionality goals for the Single-Era Analysis submodule, summarized
here:
Goal #1: help user compare designs over the whole era through any choice of a) the same
or b) different metrics
Goal #2: enforce epochs’ specified order and help understand path-dependence
Goal #3: help user see design changeability
Goal #4: allow user to explore further after performing computational analysis
A selection of visualization techniques is now described for this analysis method, again focusing
on manual user interaction rather than specifying cost/utility models or performing any sort of
computation. For each visualization presented, we evaluate its strengths and weaknesses with
respect to the goals listed above.
6.3.1 Designs as Trees
Before discussing visualizations for the entirety of Single-Era Analysis, it may be useful to
introduce the concept of visualizing a single design as a horizontal tree, branching out or
changing slope at whatever point in time the design is able to change to another design. Thus the
root node would represent the starting design, subsequent nodes would represent design IDs that
could be reached through all possible successive changes, and the levels would represent the
point in time (measured in actual unit of time – month, year, etc. – or simply epoch). Given a
changeability matrix (or any similar indication of what design transitions are possible), it should
be fairly straightforward to construct such a tree for any design in the matrix up to as many levels
as desired. More information in the matrix (transition cost, rules concerning in which epochs
changes are valid, etc.) can obviously lead to more accurate and informative visualizations, but
even a simple tree can give the user a sense of a design’s potential for change, aiding with
completing Goal #3.
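Constructing such a tree from a changeability matrix can be sketched as a short recursion. The sketch assumes the matrix has been reduced to a mapping from each design ID to the set of designs it can change into in one transition; the mapping `can_change_to`, the transitions in it, and the function names are hypothetical, and transition costs and epoch-specific rules are ignored.

```python
from typing import Dict, Set


def design_tree(can_change_to: Dict[int, Set[int]], start: int, levels: int) -> dict:
    """Expand the designs reachable from `start`, one tree level per transition."""
    node = {"design": start, "children": []}
    if levels > 0:
        for nxt in sorted(can_change_to.get(start, ())):
            node["children"].append(design_tree(can_change_to, nxt, levels - 1))
    return node


def designs_at_level(node: dict, level: int) -> Set[int]:
    """Design IDs appearing exactly `level` transitions below the given node."""
    if level == 0:
        return {node["design"]}
    found: Set[int] = set()
    for child in node["children"]:
        found |= designs_at_level(child, level - 1)
    return found


# Hypothetical one-step transitions between three designs.
can_change_to = {12: {14, 128}, 14: {128}, 128: set()}
tree = design_tree(can_change_to, start=12, levels=2)
```

A fuller version might also record "no change" as a child at each level so that every branch spans the whole era; that choice is left open here.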
6.3.1.1 Sankey Diagrams
The above suggestion only really lets users view one design-tree changeability representation at
a time, but Sankey Diagrams offer a nice way to visualize the aggregation of all of these
changes. Recall that these diagrams are particularly good at representing flow between nodes.
The “flow” of designs changing into one another thus can be represented, highlighting the
volume of change between designs, the most common designs to end up on, and all the possible
designs that could be reached within n transitions. In his thesis, Schaffner uses D3.js to illustrate
this type of concept, as seen in Figure 6-7. In this implementation, the ribbons are color-coded
according to which design ID they started at (12, 14, or 128). From frame to frame, the
transitions they make are clearly visible, as well as the proportion of the time (out of the
generated eras) that these changes occur. This visualization can be used in conjunction with other
visualizations while conducting Single-Era Analysis to identify changeability strategies before
assuming them in other parts of the analysis.
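The ribbon widths in a diagram of this kind come from counting how often each design-to-design transition occurs between consecutive frames. The following is a rough sketch of that aggregation, not a reconstruction of Schaffner's implementation: the function `transition_flows` and the sample paths are hypothetical, with each path listing the design ID held in successive frames of one generated era.

```python
from collections import Counter
from typing import Sequence, Tuple


def transition_flows(paths: Sequence[Sequence[int]]) -> Counter:
    """Count (frame, source design, target design) transitions over all paths."""
    flows: Counter = Counter()
    for path in paths:
        for frame, (src, dst) in enumerate(zip(path, path[1:])):
            flows[(frame, src, dst)] += 1
    return flows


# Hypothetical design trajectories over three frames.
paths: Sequence[Tuple[int, ...]] = [
    (12, 12, 14),
    (12, 12, 14),
    (12, 14, 14),
    (14, 14, 128),
]
flows = transition_flows(paths)

# Proportion of paths following each ribbon, i.e. the relative ribbon width.
proportions = {key: count / len(paths) for key, count in flows.items()}
```

Entries whose source and target designs are equal correspond to ribbons that continue straight across a frame without a change.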
Figure 6-7: Parallel sets (Sankey) visualization of designs following a changeability strategy from frame
to frame (i.e. every transition), from (Schaffner 2014). Color coded by start design, horizontal line size
“reflects the proportion of clips in which the corresponding design number appears in that frame”60
6.3.2 Line Graphs
Recall that while Single-Era and Multi-Epoch Analyses both involve viewing designs in a
number of different epochs, in this submodule, the epochs have a fixed order within an era. A
straightforward way of conveying this order is through a time-series visualization, where the x-axis
represents time, the y-axis represents the dependent variable, and more independent
variables can be represented through color, hue, size, shape, etc. This describes the line graphs
discussed in Section 4.1.2. In his thesis, Schaffner makes use of line graphs to show the
trajectory of the Multi-Attribute Utility and Multi-Attribute Expense of six designs over the
course of a 10-year era, shown in Figure 6-8.
60
Schaffner, M.A., Designing Systems for Many Possible Futures: The RSC-based Method for Affordable Concept
Selection (RMACS), with Multi-Era Analysis, Master of Science Thesis, Aeronautics and Astronautics, MIT, June
2014.
Figure 6-8: Two line graphs showing MAU (left) and MAE (right) of 6 designs over the course of a 4-epoch era (Epoch 1: 3 yrs, Epoch 2: 3 yrs, Epoch 3: 2 yrs, Epoch 4: 2 yrs), from (Schaffner 2014)
The fact that time is very clearly an independent variable enforces the ordered nature of the data,
accomplishing Goal #2. As demonstrated in Figure 6-8, it is possible for a user to specify any
choice of metric on the y-axis, though using different metrics for different epochs would be hard
with this type of graph, thus accomplishing Goal #1a but not #1b. While this visualization allows
the user to see the trajectory of different designs over time, it does not allow for easy
visualization of design changeability, so does not meet Goal #3, at least in this form where the
designs are simply represented by a single line. If designs were represented as trees, as described
in the previous section, and the branches were also plotted according to the era-level evaluation
metric, this would give the user a better sense of design changeability in addition to the possible
trajectory over time, aiding with Goal #3 while preserving accomplishment of the other goals
listed. Adding functionality to change the evaluation metric or the designs being evaluated would
let the user explore the data further to validate any computational analysis,
giving this type of tool the ability to complete Goal #4.
6.3.3 Parallel Coordinates
The concept behind visualizing this submodule as parallel coordinates is very similar to that of
line graphs: different designs are represented as horizontal lines, and the parallel axes can serve
to represent units of time (anything from month to year to epoch, though if using the latter, one
must be careful not to interpret the epochs all as necessarily having the same duration – perhaps
spacing axes a distance proportional to epoch duration apart would help in this case). The major
difference is that different scales can be used at different axes, so the user can choose different
metrics for different epochs if he wishes (though caution must clearly be taken to accommodate
this during interpretation), helping accomplish all of the goals that line graphs did (Goals #1a, 2,
4; if designs are represented as trees, also Goal #3), as well as Goal #1b.
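Using a different scale on each axis amounts to rescaling every axis independently before the lines are drawn. The sketch below illustrates one common way to do this, min-max scaling per axis; it is an assumption about how such a plot might be prepared, and the function name, axis labels, and values are hypothetical.

```python
from typing import Dict


def normalize_per_axis(values_by_axis: Dict[str, Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Scale each axis to [0, 1] using that axis's own minimum and maximum."""
    scaled: Dict[str, Dict[str, float]] = {}
    for axis, values in values_by_axis.items():
        lo, hi = min(values.values()), max(values.values())
        span = hi - lo
        # If every design has the same value, place all of them mid-axis.
        scaled[axis] = {d: (v - lo) / span if span else 0.5 for d, v in values.items()}
    return scaled


# Hypothetical era in which each epoch axis uses a different metric.
values_by_axis = {
    "Epoch 1 (utility)": {"A": 0.2, "B": 0.8, "C": 0.5},
    "Epoch 2 (cost, $M)": {"A": 100.0, "B": 300.0, "C": 200.0},
}
scaled = normalize_per_axis(values_by_axis)
```

After this rescaling, vertical position is only comparable within an axis, which is why interpretation across axes with different metrics calls for the caution noted above.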
6.3.4 Scatterplot Matrices
Scatterplot matrices are discussed again for this submodule as they do allow for simultaneous
comparison of designs across different epochs, either with the same or different evaluation
metrics for each epoch (achieving Goal #1a and #1b). However, based on the way the matrix is
arranged, it could be hard to convey the fixedness and importance of the epoch order. For
example, Figure 6-9 depicts sketches of two potential layouts of a scatterplot matrix. The sketch
on the left does not convey as clear a sense of order (still making it perfectly suitable for Multi-Epoch Analysis)
as the sketch on the right, which could ostensibly help achieve Goal #2.
Figure 6-9: Two layouts of a scatterplot matrix showing a four-epoch era
If the matrix layout on the right of Figure 6-9 were used for Single-Era Analysis, as with Multi-Epoch
Analysis, the representation of designs would need to be clear enough that a user could
identify the same design across epochs. This visualization does not really help a user see
changeability within designs (Goal #3), but similarly to Multi-Epoch Analysis, if after computing
the optimal designs across the era a user wishes to confirm the results through exploration,
this visualization would allow the user to narrow down the point by design variables and eyeball
it across all of the plots to ensure it is close enough to the optimal design in each one,
helping to achieve Goal #4.
6.3.5 Evaluation Summary
Table 6-3 below summarizes the relevant features of the proposed implementations from the
discussion above in the context of our IEEA Single-Era Analysis goals. To reiterate, the
evaluative criteria are as follows:
1. Does the visualization help the user simultaneously compare designs in selected epochs
across the era (and evaluate designs by the user’s choice of metric, either a) the same or
b) different across epochs)?
2. Does the visualization help the user understand that epochs being analyzed are
necessarily sequential and understand the path-dependent effects of this order?
3. Does the visualization allow the user to see designs’ potential changeability?
4. Does the visualization allow the user to explore and confirm trust in computationally
recommended optimal designs post-calculation?
The four possible answers to these questions are:
● “Yes” – This visualization achieves the goal as-is.
● “Yes with Trees” – This visualization achieves the goal (only applicable for Goal #3) if
designs are represented as trees.
● “Fine” – This visualization is mediocre; it neither actively helps nor hurts in achieving the
goal.
● “No” – This visualization hinders the achievement of or does not achieve the goal.
The best alternative, as reviewed for single-era analysis, for each row is underlined and
highlighted.
Goal \ Vis. Type | Line Graph | Parallel Coordinates | Scatterplot Matrix
Compare designs across era (same epoch metric)? (Goal #1a) | Yes | Yes | Yes
Compare designs across era (diff epoch metrics)? (Goal #1b) | No | Yes | Yes
Understand epochs ordered/path dependencies? (Goal #2) | Yes | Yes | Fine
See designs’ potential changeability? (Goal #3) | Yes with Trees | Yes with Trees | No
Post-calculation exploration? (Goal #4) | Yes | Yes | Fine
Table 6-3: Summary of characteristics for each visualization (Sec. 6.3.2-6.3.4).
From the techniques discussed here, it is evident that all three of them have different strengths,
so, in line with Schaffner’s above recommendation, it would be impractical to choose a “best”
technique. At this point in Interactive Epoch-Era Analysis, it is less useful for the user to try to
pick a “best” design over the system lifecycle than to explore the nuances and properties of
a handful of designs over eras of interest. Thus we conclude this section by inviting the user to
use a combination of any of the techniques (including the representation of design changeability
in a tree or Sankey diagram) to explore the data to get the additional knowledge he or she needs
out of this analysis technique.
6.4 Multi-Era Analysis
Recall from above the functionality goals for the Multi-Era Analysis submodule, summarized
here:
Goal #1: help user compare designs across all of the eras (through any choice of metric)
simultaneously
Goal #2: enforce that epochs in an era do have a specified order, but eras do not
Goal #3: help user see design changeability
Goal #4: help user understand and compare effects of path-dependence
Goal #5: allow user to explore further after performing computational analysis
A selection of visualization techniques is now described for this analysis method, again focusing
on manual user interaction rather than specifying cost/utility models or performing any sort of
computation. For each visualization presented, we evaluate its strengths and weaknesses with
respect to the goals listed above.
6.4.1 Line Graph Matrix
As Multi-Era Analysis logically follows from Multi-Epoch and Single-Era Analyses, the first
visualization presented is a hybrid of scatterplot matrices and line graphs: a grid of line graphs.
This visualization is exactly what it sounds like, and its analysis subsequently mirrors the
previous analysis done on both of its components: line graphs, as seen in Section 6.3.2, are great
multidimensional time-series plots that clearly enforce the ordered nature of era data (Goal #2,
part 1). Putting multiple line graphs in a grid together allows the user to compare designs across
all eras, through either the same or different metrics for each era (though always the same within
an era), as well as compare the effects of path-dependence across the ordered eras (Goals #1 and
#4). An example of such a visualization is shown in Figure 6-10.
Figure 6-10: A selection of four eras displayed in a line graph matrix
Again, because grids are associated with having some order, human cognitive biases might
impede the success of Goal #2, part 2, but allowing the user to switch around the order of eras
may help alleviate this issue. Goal #3 can again be accomplished by representing designs in each
era as trees instead of mere lines, and adding functionality to change aspects of the visualization
in real time would let the user explore the data further to validate any
computational analysis, giving this type of tool the ability to complete Goal #5.
6.4.2 Line Graphs to Represent One Design
Nuances in path-dependence can really be brought out by comparing eras with small
differences, whether in epoch durations, epoch order, or a few of the epochs themselves. The above suggestion
for a line graph matrix offers at-a-glance comparison of multiple designs in multiple eras. In
order to deeply compare these effects of path-dependence, it may be useful to compare one
design (and the subsequent designs it can change into) across many eras on one graph. As we
have previously seen that line graphs have been a good candidate to view time-series data, we
introduce the possibility of viewing the same design on one line graph, as seen in Figure 6-11.
Note that this representation only makes sense if all eras (and epochs within eras) can be
evaluated with the same metric on the y-axis (e.g. same stakeholder preferences, or same
definition of multi-attribute utility).
Figure 6-11: Line graph showing trajectories (in terms of MAU) for one design (and subsequent
changes/options) across 3 eras
As this visualization was proposed with the intention of making path-dependence effects clearer
for each individual design, it helps achieve Goal #4. Displaying time on the x-axis
reiterates that the progression of epochs within each era is very much ordered, while overlaying
the eras on a single graph, with no necessary order among them, conveys better than the line graph
matrix that the eras themselves are unordered, achieving Goal #2. When observing just a single design,
this graph makes it very clear where the design has change options,
potentially helping a user develop a strategy for design changeability and helping achieve Goal #3.
Goal #1 is fulfilled for one design, though it becomes inefficient to use this visualization if the
user wishes to analyze several such designs at this point. For this visualization to be truly
interactive and exploratory (and help fulfill Goal #5), the user should have the option of
changing the evaluation metric and eras being compared.
6.4.3 Sankey Diagrams
Sankey Diagrams can again prove useful in charting the volume and nature of changes between
designs, especially after developing a changeability strategy. Similar to the use case described in
Section 6.3.1.1, a Sankey Diagram can be used to represent an aggregation of all the change
paths that a design or group of designs could take over the course of an era. Ribbons could be
color-coded in a couple of different ways to bring out different features of the data: Similar to in
Single-Era Analysis, they could be colored based on start-state design, to highlight the flow of
multiple designs through one or more eras. For truly examining design flow across multiple eras,
the levels could represent units of time and ribbons could also be colored/hued based on era,
though this limits the number of designs that can be displayed meaningfully through such a
diagram.
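The aggregation step described above can be sketched as follows. This is only an illustration under stated assumptions: the `changePaths` data, the function name `aggregateLinks`, and the output field names are hypothetical, each path is assumed to list the design occupied at successive time steps, and ribbons are grouped by start-state design so they can be colored that way.

```javascript
// Hypothetical change paths: the design occupied at each time step of an era.
// The first entry of each path is its start-state design.
const changePaths = [
  ["D1", "D2", "D2"],
  ["D1", "D1", "D3"],
  ["D2", "D2", "D3"],
  ["D1", "D2", "D3"]
];

// Aggregate paths into Sankey links. A link joins a design at level k to a
// design at level k + 1; its value counts how many paths make that move.
// Counts are kept per start-state design so ribbons can be colored by it.
// Assumes design ids never contain the "|" separator used in the keys.
function aggregateLinks(paths) {
  const counts = {};
  paths.forEach(function (states) {
    for (let k = 0; k + 1 < states.length; k++) {
      const key = [k, states[k], states[k + 1], states[0]].join("|");
      counts[key] = (counts[key] || 0) + 1;
    }
  });
  return Object.keys(counts).map(function (key) {
    const parts = key.split("|");
    return {
      level: Number(parts[0]),
      source: parts[1],
      target: parts[2],
      colorBy: parts[3],
      value: counts[key]
    };
  });
}

const links = aggregateLinks(changePaths);
console.log(links);
```

To color by era instead, the same function could be run once per era and the era name stored in `colorBy`, at the cost of multiplying the number of ribbons.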
Both of these schemes highlight design changeability, helping achieve Goal #3, but since they
show only changes and not performance by any evaluation metric, they do not help with Goal #1.
Because the presented diagrams do not show designs’ performance, it is harder to understand the
effects path-dependence has on performance (Goal #4), even though the effects on changeability
are very apparent. Like line graphs, Sankey Diagrams ordered by time show a
definite order to epochs within an era, while color-coding by era does not give the sense of eras
being ordered, achieving Goal #2. Sankey Diagrams also leave considerable room for the user
to explore further, depending on the interactivity scheme and on options to filter or to change
the color and level encodings, giving them the potential to accomplish Goal #5 adequately.
6.4.4 Evaluation Summary
Table 6-4 below summarizes the relevant features of the proposed implementations from the
discussion above in the context of our IEEA Multi-Era Analysis goals. To reiterate, the
evaluative criteria are as follows:
1. Does the visualization help the user simultaneously compare designs in selected eras (and
evaluate designs by the user’s choice of metric)?
2. Does the visualization help the user understand that epochs within eras being analyzed
are necessarily sequential but eras themselves are not?
3. Does the visualization allow the user to see designs’ potential changeability?
4. Does the visualization allow the user to understand the effects of path-dependence?
5. Does the visualization allow the user to explore and confirm trust in computationally
recommended optimal designs post-calculation?
The four possible answers to these questions are:
● “Yes” – This visualization achieves the goal as-is.
● “Yes with Trees” – This visualization achieves the goal (only applicable for Goal #3) if
designs are represented as trees.
● “Fine” – This visualization is mediocre; it neither actively helps nor hinders achieving the
goal.
● “No” – This visualization hinders the achievement of or does not achieve the goal.
The best alternative, as reviewed for multi-era analysis, for each row is underlined and
highlighted.
Goal \ Vis. Type                                 | Line Graph Matrix | Line Graphs for One Design | Sankey Diagrams
-------------------------------------------------|-------------------|----------------------------|----------------
Compare designs across eras? (Goal #1)           | Yes               | Fine                       | No
Understand epochs ordered, eras not? (Goal #2)   | Fine              | Yes                        | Yes
See designs’ potential changeability? (Goal #3)  | Yes with Trees    | Yes                        | Yes
Understand path-dependence? (Goal #4)            | Yes               | Yes                        | Fine
Post-calculation exploration? (Goal #5)          | Yes               | Fine                       | Fine
Table 6-4: Summary of characteristics for each visualization (Sec. 6.4.1-6.4.3).
It becomes clear, for this most complex IEEA submodule, that it is extremely difficult to devise
a single visualization that accomplishes all goals well by itself, reinforcing the point that a
combination of different visualizations is optimal for complex processes like Single- and
Multi-Era Analyses. Each of the three visualizations was presented with a particular goal in
mind, so naturally they offer different strengths, and each would be worth looking into
depending on which aspects of the data a user wishes to explore further. As with Single-Era
Analysis, at this point in IEEA it is more useful for a user to be able to explore the properties
of a few designs than to pick a best design from all of the potential options, which also lessens
the computational load and cognitive burden of processing all designs over all eras being analyzed.
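The case for combining visualizations can be checked directly against the ratings in Table 6-4. The sketch below is illustrative only: it transcribes the table's ratings, treats both “Yes” and “Yes with Trees” as achieving a goal (an assumption of this sketch), and uses the hypothetical function names `achieves` and `goalsCovered`.

```javascript
// Ratings transcribed from Table 6-4; positions 0 to 4 correspond to Goals #1 to #5.
const ratings = {
  "Line Graph Matrix":          ["Yes", "Fine", "Yes with Trees", "Yes", "Yes"],
  "Line Graphs for One Design": ["Fine", "Yes", "Yes", "Yes", "Fine"],
  "Sankey Diagrams":            ["No", "Yes", "Yes", "Fine", "Fine"]
};

// Assumption of this sketch: "Yes" and "Yes with Trees" both count as achieving a goal.
function achieves(rating) {
  return rating === "Yes" || rating === "Yes with Trees";
}

// Goals (numbered from 1) achieved by at least one visualization in the given set.
function goalsCovered(names) {
  const covered = [];
  for (let g = 0; g < 5; g++) {
    const hit = names.some(function (name) { return achieves(ratings[name][g]); });
    if (hit) covered.push(g + 1);
  }
  return covered;
}

console.log(goalsCovered(["Line Graph Matrix"]));
console.log(goalsCovered(["Line Graph Matrix", "Line Graphs for One Design"]));
```

Under these assumptions no single visualization covers all five goals, while the line graph matrix paired with either of the other two does.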
Chapter 7: Discussions and Conclusion
Because such a vast amount of data can go into IEEA, there is an even greater number of insights
that a user can gain from any of these submodules individually and, more importantly, in
combination. Thus, the most thorough analysis almost necessarily involves an iterative process
through the modules and submodules, refining analysis based on new results, as suggested at the
very beginning of this thesis. Consistent with this idea as well as Schaffner’s recommendation
for a “widely-varying visualization approach,” the above discussions show that
different interactive visualizations have different strengths, so using several in combination
and/or in sequence yields the most insight.
As noted repeatedly, the visualizations presented in this thesis are not meant to be an exhaustive
list, nor are they presented exactly as they must be implemented. Instead, the author hopes they
will provide a foundation and spark ideas for visualizations that will be more useful and tailored
to a user’s specific Interactive Epoch-Era Analysis goals. When considering any new type of
visualization or interface, evaluating functionality and usability based on these specific goals, as
demonstrated in this thesis, can inform how helpful the software will be for a user’s specific
needs.
User-centric design requires input from real users to inform the goals of the design.
Though the user goals for each submodule in this thesis were confirmed by student researchers
who ranged from novice to expert EEA users, they could be further refined by interviewing
individuals who might actually use IEEA in a non-academic setting for real large-scale
decision-making tasks.
In addition to more authentic user studies (and subsequent implementation/evaluation of
interfaces based on refined goals), there is plenty of room to expand on this work, both in theory
and in practice. This thesis has only discussed mouse-based interaction techniques for on-screen
visualizations, but there are many more ways to interact with and perceive data. Multimodal
interfaces make it possible to command and explore on-screen data by speech, touch, gesture,
etc. directly and through external technology. Visualizations do not even have to be constrained
to a screen, as it is possible to project data onto a 3D space for an immersive perceptive
experience (through holographic technology, for example). Lastly, the perception of data can
even happen through other senses, most notably audibly, as data sonification (conveying
information through non-speech audio) can be used instead of or in addition to visualization to
enrich the perceptive experience. All of these methods of interacting with and conveying data
could have enormous potential for simplifying or further breaking down processes as complex as
those in IEEA, and thus could merit further exploration.
As stated in the thesis overview, the work presented in this thesis aims to contribute to an
ongoing effort (as put forth in Curry 2015) to demonstrate that adding interactivity to interfaces
increases user satisfaction. Regardless of the method, it is hopefully clear that, in the context
of the visual analytics process, adding interactivity to interfaces has the potential to help users
gain much better insight from analyses than static data displays allow. By combining the
computational and visual display capabilities of computers with the cognitive, decision-making
capabilities of humans in the loop, the IEEA process itself has much to gain from the power of
visual analytics.
Bibliography
Bostock, M. Mike Bostock. Feb 2013. <bost.ocks.org/mike>.
Cao, N. “A Survey on Multidimensional Visual Analysis Techniques.” Hong Kong University of
Science and Technology, Sept 2011.
Chan, W.W. “A Survey on Multivariate Data Visualization.” Department of Computer Science
and Engineering, Hong Kong University of Science and Technology, June 2006.
Chang, Remco. “Big Data Visual Analytics: A User-Centric Approach” [PowerPoint Slides].
Cleveland, W.S. and R. McGill, “Graphical Perception: Theory, Experimentation, and
Application to the Development of Graphical Methods,” Journal of the American
Statistical Association, 79-387, 1984.
Curry, M.D. and Ross, A.M., "Considerations for an Extended Framework for Interactive
Epoch-Era Analysis," CSER 2015.
Diller, N. P. “Utilizing Multiple Attribute Tradespace Exploration with Concurrent
Design for Creating Aerospace Systems Requirements,” Master of Science Thesis,
Aeronautics and Astronautics, Massachusetts Institute of Technology, June 2002.
Few, Stephen. Information Dashboard Design: The Effective Visual Communication of Data.
Beijing: O'Reilly, 2006.
Fitzgerald, M.E. and Ross, A.M., "Mitigating Contextual Uncertainties with Valuable
Changeability Analysis in the Multi-Epoch Domain," 6th Annual IEEE Systems
Conference, Vancouver, Canada, March 2012.
Fitzgerald, M.E. and Ross, A.M., "Sustaining Lifecycle Value: Valuable Changeability Analysis
with Era Simulation," 6th Annual IEEE Systems Conference, Vancouver, Canada, March
2012.
Fitzgerald, M.E., Ross, A.M., and Rhodes, D.H., "Assessing Uncertain Benefits: a Valuation
Approach for Strategic Changeability (VASC)," INCOSE International Symposium
2012, Rome, Italy, July 2012.
Friendly, M. “Statistical Graphics for Multivariate Data.” SAS SUGI 16 Conference, Apr 1991.
Fulcoly, D.O., Ross, A.M., and Rhodes, D.H., "Evaluating System Change Options and Timing
Using the Epoch Syncopation Framework," 10th Conference on Systems Engineering
Research, St. Louis, MO, March 2012.
Grinstein, G., Trutschl, M., and Cvek, U. “High-Dimensional Visualizations.” 7th Data Mining
Conference-KDD 2001.
Hoffman, P.E. “Table Visualizations: A Formal Model and Its Applications”, Doctoral
Dissertation, Computer Science Department, University of Massachusetts at Lowell,
1999.
Keeney, R. L., & Raiffa, H. (1976). Decisions with Multiple Objectives. Wiley, New York.
Keim, D. A., Mansmann, F., Schneidewind, J., Thomas, J., & Ziegler, H. (2008). Visual
Analytics: Scope and Challenges. In Visual Data Mining (pp. 76–90).
Liu, Y. “Visualization of Multivariate Data” Department of Biomedical, Industrial and Human
Factors Engineering, Wright State University, Fall 2014.
<http://www.stat.sc.edu/~hansont/stat730/MultivariateDataVisualization.pdf>
Miller, R. 6.831 User Interface Design and Implementation, Spring 2015. (Massachusetts
Institute of Technology: MIT Stellar
<https://stellar.mit.edu/S/course/6/sp15/6.813/materials.html>).
Pina, A.L. “Applying Epoch-Era Analysis for Homeowner Selection of Distributed Generation
Power Systems,” Master of Science Thesis, Engineering and Management, Massachusetts
Institute of Technology, June 2014.
Quesenbery, W. “Balancing the 5Es: Usability,” Cutter IT Journal, 17-2, 2004.
Rader, A.A., Ross, A.M., and Rhodes, D.H., "A Methodological Comparison of Monte Carlo
Methods and Epoch-Era Analysis for System Assessment in Uncertain Environments,"
4th Annual IEEE Systems Conference, San Diego, CA, April 2010.
Rader, A.A., Ross, A.M., and Fitzgerald, M.E., "Multi-Epoch Analysis of a Satellite
Constellation to Identify Value Robust Deployment across Uncertain Futures," AIAA
Space 2014, San Diego, CA, August 2014
Rhodes D.H. and Ross A.M., Interactive Model-Centric Systems Engineering (IMCSE) Phase
One Technical Report. SERC-2014-TR-048-1; September 2014.
Rhodes D.H. and Ross A.M., Interactive Model-Centric Systems Engineering (IMCSE) Phase
Two Technical Report SERC-2015-TR-048-2; February 2015.
Ricci, N., Schaffner, M.A., Ross, A.M., Rhodes, D.H., Fitzgerald, M.E., "Exploring Stakeholder
Value Models Via Interactive Visualization," 12th Conference on Systems Engineering
Research, Redondo Beach, CA, March 2014.
Roberts, C.J., Richards, M.G., Ross, A.M., Rhodes, D.H., and Hastings, D.E., "Scenario
Planning in Dynamic Multi-Attribute Tradespace Exploration," 3rd Annual IEEE
Systems Conference, Vancouver, Canada, March 2009.
Ross A.M. Interactive Model-Centric Systems Engineering. 5th Annual SERC Sponsor Research
Review. Washington, DC: Georgetown University, February 2014.
Ross, A.M., and Rhodes, D.H., "Using Natural Value-centric Time Scales for Conceptualizing
System Timelines through Epoch-Era Analysis," INCOSE International Symposium
2008, Utrecht, the Netherlands, June 2008.
Ross, A.M. and Hastings, D.E., "Assessing Changeability in Aerospace Systems Architecting
and Design Using Dynamic Multi-Attribute Tradespace Exploration," AIAA Space 2006,
San Jose, CA, September 2006
Ross, A.M., “Managing Unarticulated Value: Changeability in Multi-Attribute Tradespace
Exploration,” PhD thesis, Engineering Systems Division, Massachusetts Institute of
Technology, June 2006.
Sankey Diagrams “Sankey Definitions” <http://www.sankey-diagrams.com/sankey-definitions/>
Schaffner, M.A., “Designing Systems for Many Possible Futures: The RSC-based Method for
Affordable Concept Selection (RMACS), with Multi-Era Analysis,” Master of Science
Thesis, Aeronautics and Astronautics, Massachusetts Institute of Technology, June 2014.
Schaffner, M.A., Ross, A.M., and Rhodes, D.H., "A Method for Selecting Affordable System
Concepts: A Case Application to Naval Ship Design," 12th Conference on Systems
Engineering Research, Redondo Beach, CA, March 2014.
Schofield, D.M. “A Framework and Methodology for Enhancing Operational Requirements
Development: United States Coast Guard Cutter Project Case Study.” Massachusetts
Institute of Technology, 2010.
Smaling, R.M. “System Architecture Analysis and Selection Under Uncertainty.” PhD thesis,
Engineering Systems Division, Massachusetts Institute of Technology, June 2005.
Spears, W.M. “An Overview of Multidimensional Visualization Techniques.” Evolutionary
Computation Visualization Workshop, 1999.
Tacca, M. C. “Commonalities between Perception and Cognition.” Frontiers in Psychology, 2:
358. PMC. 2011.
Tufte, Edward R. The Visual Display of Quantitative Information. Cheshire, Conn.: Graphics,
1983.
Various. (2010). Mastering the Information Age: Solving Problems with Visual Analytics. (D.
Keim, J. Kohlhammer, G. Ellis, & F. Mansmann, Eds.). Eurographics.
Wallace, Rosa. “Graphing Resource.” NC State University.
<https://www.ncsu.edu/labwrite/res/gh/gh-linegraph.html> 2004.
Wang, Y., Teoh, S.T., and Ma, K. “Evaluating the Effectiveness of Tree Visualization Systems
for Knowledge Discovery.” Eurographics/IEEE-VGTC Symposium on Visualization,
2006.
Ware, Colin. Information Visualization: Perception for Design. Elsevier, 2013.
"What Is Systems Engineering?" INCOSE. International Council on Systems Engineering, 14
June 2004. <http://www.incose.org/practice/whatissystemseng.aspx>.
Wong, P.C. and Bergeron, R.D. “30 Years of Multidimensional Multivariate Visualization.”
Department of Computer Science, University of New Hampshire, 1997.
Appendix
Selected parts of code for the implementation described in Section 5.2 (generated and modified from a template on
Mike Bostock’s website):
var tree = d3.layout.tree()
    .size([height, width]);

var diagonal = d3.svg.diagonal()
    .projection(function(d) { return [d.y, d.x]; });

var tip = d3.tip()
    .attr("class", "d3-tip")
    .offset([-10, 0])
    .html(function(d) { return d.name; });

var svg = d3.select("#treeablediv").append("svg")
    .attr("width", width + margin)
    .attr("height", height + margin)
  .append("g")
    .attr("transform", "translate(" + margin + "," + margin + ")");

svg.call(tip);
function update(source) {
  // Compute the new tree layout.
  var nodes = tree.nodes(root).reverse(),
      links = tree.links(nodes);

  // Normalize for fixed-depth.
  nodes.forEach(function(d) { d.y = d.depth * 180; });

  // Update the nodes…
  var node = svg.selectAll("g.node")
      .data(nodes, function(d) { return d.id || (d.id = ++i); });

  // Enter any new nodes at the parent's previous position.
  var nodeEnter = node.enter().append("g")
      .attr("class", "node")
      .attr("transform", function(d) { return "translate(" + source.y0 + "," + source.x0 + ")"; })
      .on("click", click); ////////// MODE DEPENDENT
      //.on("dblclick", function(d) {if (!(d in selections)) {selections.push(d);}
      //                             newclick();});

  nodeEnter.append("circle")
      .attr("r", 1e-6)
      .style("fill", function(d) { return d._children ? "lightsteelblue" : "#fff"; }); /////// FILL

  nodeEnter.append("text")
      .attr("x", function(d) { return d.children || d._children ? -10 : 10; })
      .attr("dy", ".35em")
      .attr("text-anchor", function(d) { return d.children || d._children ? "end" : "start"; })
      .text(function(d) { return d.name; })
      .style("fill-opacity", 1e-6);

  // Transition nodes to their new position.
  var nodeUpdate = node.transition()
      .duration(duration)
      .attr("transform", function(d) { return "translate(" + d.y + "," + d.x + ")"; });

  nodeUpdate.select("circle")
      .attr("r", 4.5)
      .style("fill", function(d) { return d._children ? "lightsteelblue" : "#fff"; });

  nodeUpdate.select("text")
      .style("fill-opacity", 1);

  // Transition exiting nodes to the parent's new position.
  var nodeExit = node.exit().transition()
      .duration(duration)
      .attr("transform", function(d) { return "translate(" + source.y + "," + source.x + ")"; })
      .remove();

  nodeExit.select("circle")
      .attr("r", 1e-6);

  nodeExit.select("text")
      .style("fill-opacity", 1e-6);

  // Update the links…
  var link = svg.selectAll("path.link")
      .data(links, function(d) { return d.target.id; });

  // Enter any new links at the parent's previous position.
  link.enter().insert("path", "g")
      .attr("class", "link")
      .attr("d", function(d) {
        var o = {x: source.x0, y: source.y0};
        return diagonal({source: o, target: o});
      });

  // Transition links to their new position.
  link.transition()
      .duration(duration)
      .attr("d", diagonal);

  // Transition exiting links to the parent's new position.
  link.exit().transition()
      .duration(duration)
      .attr("d", function(d) {
        var o = {x: source.x, y: source.y};
        return diagonal({source: o, target: o});
      })
      .remove();

  // Stash the old positions for transition.
  nodes.forEach(function(d) {
    d.x0 = d.x;
    d.y0 = d.y;
  });
}
// Toggle between Normal mode (click expands/collapses) and Select mode (click selects epochs),
// restyling the tree container and updating the button and instruction text to match.
function switchmode() {
  SELECT_MODE = !SELECT_MODE;
  if (!SELECT_MODE) { // in Normal Mode
    document.getElementById("selectbutton").innerHTML = "Switch to SELECT mode";
    document.getElementById("treeablediv").style.backgroundColor = "#EAEAEA";
    document.getElementById("treeablediv").style.color = "black";
    svg.selectAll("g.node").style("fill", "black");
    svg.selectAll(".link").style("stroke", "#444");
    document.getElementById("intro").innerHTML = "<br><br>Click on the nodes to expand them.";
  } else { // in Select Mode
    document.getElementById("selectbutton").innerHTML = "IN SELECT MODE <br> Click to switch back";
    document.getElementById("treeablediv").style.backgroundColor = "#800007";
    document.getElementById("treeablediv").style.color = "white";
    svg.selectAll("g.node").style("fill", "white");
    svg.selectAll(".link").style("stroke", "#DDD");
    document.getElementById("intro").innerHTML = "<br><br>Click on the nodes to select all descendant epochs.";
  }
}
function click(d) {
  if (!SELECT_MODE) { // in NORMAL mode: expand or collapse the clicked node
    if (d.children) {
      d._children = d.children;
      d.children = null;
    } else {
      d.children = d._children;
      d._children = null;
    }
    update(d);
  } else { // in SELECT mode: select all descendant epochs of the clicked node
    selections = [];
    listleaves(d);
    // Iterate by index rather than for-in so only array elements are visited.
    for (var a = 0; a < selections.length; a++) {
      if (typeof selections[a] == "object") {
        // indexOf is the standard membership test; plain arrays have no contains().
        if (uniqueEpochs.indexOf(selections[a].name) === -1) {
          console.log(selections[a].name);
          $("#" + selections[a].name).css({"background-color": "#800007", "color": "white"});
          uniqueEpochs.push(selections[a].name);
          SELECTED_EPOCHS.push(selections[a]);
        }
      }
    }
    // Sort epoch names numerically by the number following their first character.
    uniqueEpochs.sort(function(c, b) { return parseInt(c.slice(1), 10) - parseInt(b.slice(1), 10); });
    document.getElementById("selectedepochs").innerHTML = "Selected Epochs: " + uniqueEpochs +
        ' <button type="button" onclick="resetselections()">Reset Selections</button>';
  }
}
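The click handler calls listleaves, which is not among the selected parts of code reproduced in this appendix. The following is a minimal sketch of what such a helper might look like, assuming its purpose is to gather every leaf (epoch) beneath the clicked node, including leaves under collapsed branches stored in _children. The name `collectLeaves` and the sample tree are hypothetical, and unlike the original this version returns a new array rather than filling the global selections array.

```javascript
// Hypothetical stand-in for the listleaves helper, which is not shown in the appendix.
// Returns all leaf descendants of a node, following both expanded (children)
// and collapsed (_children) branches, as toggled by the click handler.
function collectLeaves(node) {
  const kids = node.children || node._children;
  if (!kids || kids.length === 0) {
    return [node]; // a node with no children on either property is a leaf
  }
  let leaves = [];
  kids.forEach(function (child) {
    leaves = leaves.concat(collectLeaves(child));
  });
  return leaves;
}

// Hypothetical tree: branch A is expanded, branch B is collapsed.
const sampleTree = {
  name: "root",
  children: [
    { name: "A", children: [{ name: "E1" }, { name: "E2" }] },
    { name: "B", children: null, _children: [{ name: "E3" }] }
  ]
};

const leafNames = collectLeaves(sampleTree).map(function (d) { return d.name; });
console.log(leafNames.join(", "));
```

Following _children as well as children means a selection does not depend on which branches happen to be expanded on screen at the time of the click.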