Visual Analytic Techniques for Structural and Functional Discovery in Dynamic Graphs

Shawn Mankad, George Michailidis
Abstract—Time series of graphs are increasingly prevalent in modern data and pose unique challenges to visual exploration and
pattern extraction. This paper describes the application of matrix factorizations that enhance existing visualization techniques for
exploration and pattern detection in dynamic graph series. The combination of matrix factorization and visualizations allows the user to
home in on and display interesting, underlying structure and its evolution over time. The methods are scalable to data sets with a large
number of time points or nodes, and can accommodate sudden changes to graph topology. The contribution of our techniques to visual
exploration is demonstrated with several dynamic graph series from both synthetic and real world data. The real graphs are obtained
from citation and trade networks. These examples illustrate how users can steer the techniques and augment existing visualization
methods to discover and display meaningful patterns in sizable graphs over many time points.
Index Terms—Visual analytics, dynamic graph filtering, structural graph discovery, graph visualization, matrix decomposition
1 INTRODUCTION
Visual analytics have become an important class of
methods for extracting information from complex
data. The combination of sophisticated analytical and
visualization techniques allows users to explore and
detect interesting patterns in large amounts of data,
and develop a deeper understanding of the underlying
mechanisms.
Due to advances in data collection technologies, time
series of graphs (networks) are increasingly common in
a variety of fields, such as sociology [1], biology [2],
and finance [3], among others. The analysis of such
data is challenging, since time dependent changes may
simultaneously affect network topology and node/edge
features.
Common visualization techniques for dynamic graphs
enhance static methods with animations that move
nodes as little as possible between time steps to facilitate
readability. However, the effectiveness of these methods
relies on the human ability to perceive and remember
changes [4]. Moreover, experiments have discovered that
the effectiveness of dynamic layouts is strongly predicted by node speed and target separation [5]. Thus, dynamic graph visualizations encounter difficulties when
faced with a large number of time points, larger graphs
that feature abrupt, non-smooth changes, or if the user
is interested in detailed analysis, especially at the individual node level (see Section 3.2 of [6], [7], [8]).
• Shawn Mankad is with the Department of Statistics, University of Michigan, Ann Arbor, MI, 48109. E-mail: smankad@umich.edu
• George Michailidis is with the Departments of Statistics and Electrical Engineering & Computer Science, University of Michigan, Ann Arbor, MI, 48109.
In this work, we present methods that address these
challenges by utilizing visualizations that leverage matrix factorization to detect and display underlying structure. In particular, we are interested primarily in three
tasks that necessitate an approach with mathematical
foundations. (i) Produce static displays of the evolving
node connectivity, while explicitly incorporating the temporal dimension. This can be especially useful when one
is given a large number of time points, or if the graph
sequence contains sudden changes. (ii) Time-varying,
overlapping community assignment, which could be
used for node coloring, aggregation, and so on to aid
in visual analysis. (iii) Reduce clutter in graph layouts
in a principled fashion by removing unimportant edges
via a filter. This preprocessing can help facilitate visual
exploration and information extraction.
The proposed methodology relies on examining how
lower dimensional matrix representations evolve, and
controlling their evolution using a constrained optimization. The constraint strengths, which control how
sensitive the matrix representations are to short term
fluctuations, are set by the user to steer the analysis.
The methodology is scalable to data sets with large
numbers of time points or nodes, and can accommodate
abrupt changes that are challenging to dynamic layouts.
Visual exploration of dynamic graphs based on matrix
factorizations is an important approach to consider for
many reasons. First, factorizations allow the user to see
typical time-varying node behaviors. Some of these time-varying connectivity patterns may be expected, as in the
rapid rise in connectivity of hub nodes in preferential attachment networks, but others may be truly discovered.
As a consequence, matrix factorizations also allow the
user to see a measure of the dynamic network’s complexity, specifically, the number and types of evolving
nodes in the data.
The remainder of this article is organized as follows:
in the next section, we review related work on visualization of graphs. We provide background on matrix
factorizations in Section 3, followed by a description of
our approach (Section 4). Mathematical detail behind
the proposed factorization is given in Section 5. The
corresponding procedure to obtain estimates is
discussed in Section 6. We then discuss different displays
based on matrix factorizations and exemplify our approach on several simulated and real-world data sets in
Sections 7 and 8, and close the article with a brief discussion
(Section 9).
2 RELATED WORK ON NETWORK ANALYSIS AND VISUALIZATION
2.1 Static Graphs
There is a large literature focusing on discovering patterns within and among a set of static graphs, that is,
graphs that are not changing over time. Traditional examples of network analysis include discovering community structure, and using different connectivity measures,
such as path length, degree, modularity, and so on, to
characterize the relative importance of particular subsets
of nodes [9], [10], [11]. These traditional tools can experience difficulties with a sequence of networks, as they
do not explicitly incorporate any temporal information.
Yet, this is a key aspect, as time dependent changes may
simultaneously affect network topology and node/edge
features.
Due to the challenges of analyzing such complex objects, graph layouts have become important for detecting
meaningful structure. [12], [13] are classical texts that
discuss traditional graph layouts, which impose criteria such as display symmetry, minimal edge crossings,
uniform edge lengths and so on for aesthetic reasons. A
number of software packages have been developed for
graph drawing of complex networks (see, for example,
[14], [15], [16], [17], [18]).
More recent graph visualization techniques have been
developed for graphs on the order of hundreds of thousands of nodes, and depend on the stage and goal of the
analysis, and attributes of the given graphs (e.g., static
vs. dynamic, weighted vs. unweighted, etc.) [6], [7]. For
example, an overview of large graphs can be performed
efficiently using aggregation techniques to avoid drawing every node [19]. For social networks, graph drawing
techniques based on semantic and structural abstraction
can be used [20], and methods to visualize uncertainty
have been developed [21], [22]. Other approaches similar
in spirit to this work have been created for community assignment in large, static sparse graphs through
computation of eigenvectors of graph related matrices
[23]. These communities could be used for aggregation,
node coloring, etc., to facilitate visual analysis. While
we follow the same principles as these works, we offer
analysis for a time-series of graphs.
2.2 Dynamic Graphs
When given time-varying graphs, the majority of algorithms are designed for animated drawing of the graph
sequence. For example, [24] propose an online algorithm
that uses the Graphics Processing Unit to efficiently
represent the main global structure of the graphs while
preserving the layout’s temporal stability. An important challenge for such an approach is to preserve the
user’s mental map [4], [25], [26]. Specifically, the same
overall shape and attributes should be preserved and
nodes moved as little as possible between time steps to
facilitate readability. An alternative approach is called
the small multiples approach [27], which allows the
user to view all time periods simultaneously using a
matrix of images. Numerous studies have shown the
small multiples approach to be superior for some graph
comprehension tasks [4], [28].
However, analyzing even a hundred time points can
be challenging for either approach if the user is interested in detailed analysis. [5] find that the effectiveness of animation is strongly predicted by node speed
and target separation. Thus, there exists a bottleneck
stemming from the user’s cognitive load, as the user
must remember patterns over a large time span or time
points must be traversed quickly, increasing node speed.
With small multiples, there usually exists a scarcity of
screen space for so many images. Both approaches can
also struggle with larger graphs that feature non-smooth
changes to their topology.
In this work, we address the challenge of scalability
both in terms of time and network size by representing
the network sequence with a set of time-series for each
node. The paths of each node over all time points can
be displayed using static visualizations. Hence, this work
is particularly suited for representing temporal changes
at the individual node level, a task that can be difficult
using graph drawing techniques [8].
Another recent approach, the so-called TimeMatrix [8],
relies on viewing adjacency matrices at different levels of
aggregation to help users gain insight about the temporal
aspects of network sequences. Since the main display is
of an adjacency matrix, TimeMatrix performs well when
the user is especially interested in the evolution of edges. In contrast to
TimeMatrix, our approach focuses on node dynamics.
Both approaches complement graph drawing through
matrix-based representations, and are designed for analytical tasks that may be difficult with pure drawing
approaches.
3 BACKGROUND
In this section, we provide background information on
two matrix factorizations and how they have been utilized to facilitate successful application of visualization
techniques. The extension to dynamic networks is presented in Section 4.
3.1 Matrix Factorizations
The most common factorization is the Singular Value
Decomposition (SVD), which has important connections
to community detection [29], graph drawing [30], and
areas of statistics and signal processing [31]. For instance,
in classical spectral layout, the coordinates of each node
are given by the SVD of graph related matrices, and can
be calculated efficiently using tools in [30], [32].
The non-negative matrix factorization (NMF) is an
alternative factorization that has been shown to be advantageous for visualization of non-negative data [33],
[34], [35]. This is typically the case with networks, as
edges commonly correspond to flows, capacity, or binary
relationships, and hence are non-negative. Recently, theoretical connections between NMF and important problems in data mining have been developed [36], [37], and
accordingly, NMF has been proposed for overlapping
community detection on static networks [38], [39].
Both types of factorization approximate a given graph
related matrix with an outer product
$$A \approx U V^T, \tag{1}$$
where $A$ is usually the $n \times n$ adjacency or Laplacian
matrix, and $U$ and $V$ are both $n \times K$ matrices. The rank
or dimension of the approximation $K$ is chosen to obtain
a good fit to the data while achieving interpretability.
The key difference between SVD and NMF lies in the
constraints that are placed on $U$ and $V$. SVD imposes
a particular geometry on the factorization, so that $U$
and $V$ can each be viewed as coordinate systems that fit
the data. In particular, each (eigen)vector $U_i$ is perpendicular to every other vector $U_j$, so that the collection
$\{U_1, \ldots, U_K\}$ forms a lower dimensional orthonormal
space that the data can be visualized in. In addition, $U$
satisfies $U^T U = I$ (orthonormality constraints). A similar
characterization holds for $\{V_1, \ldots, V_K\}$. In contrast, with
NMF the orthogonality constraints are replaced with a
restriction of non-negativity of the factorized matrices
[33], [40]. That is, every element of $U$ and $V$ is greater
than or equal to zero. The geometric characterization
of SVD is traded for the enhanced interpretability that
comes from strictly additive combinations. For instance,
consider (1) in element form $A_{ij} = \sum_{k=1}^{K} U_{ik} (V^T)_{kj}$. Each
term of the sum can be thought of as the contribution
of community $k$ to edge $A_{ij}$, since all terms are non-negative.
Both factorizations are useful for discovering interesting node connectivities in networks. The U vectors
score nodes by their “interestingness”, or distance from
the average outgoing connectivity. The V vectors yield
similar scores based on incoming connections. Together,
U and V are useful for highlighting nodes by their
importance to interconnectivity.
For illustration, consider the graph structures and
matrix factorizations shown in Fig. 1. The U vector
for both NMF and SVD on the Star Network highlights
the central node, which has the largest magnitude. The V
vectors show that all peripheral nodes are equal in terms
of their incoming connections, and that the central node
has no incoming connections.
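As a rough illustration of the rank-1 factorizations discussed above, the following NumPy sketch factorizes the Star Network adjacency matrix with both SVD and NMF. Two points are assumptions on our part rather than the setup behind Fig. 1: the SVD here is applied to the adjacency matrix (Fig. 1 uses the Laplacian), and the helper `nmf_multiplicative`, its iteration count, and the small constant `eps` are hypothetical choices based on the standard Lee and Seung multiplicative updates.

```python
import numpy as np

# Star Network: node 0 sends an edge to each of the other four nodes.
A = np.zeros((5, 5))
A[0, 1:] = 1.0

# Rank-1 SVD of the adjacency matrix (an assumption: Fig. 1 uses the Laplacian).
left, sing, right_t = np.linalg.svd(A)
U_svd = left[:, :1] * sing[0]   # n x 1, scaled by the leading singular value
V_svd = right_t[:1, :].T        # n x 1

def nmf_multiplicative(A, K, n_iter=200, eps=1e-9, seed=0):
    """Standard multiplicative updates for A ~ U V^T with U, V >= 0."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    U = rng.random((n, K)) + 0.1   # dense, positive initialization
    V = rng.random((m, K)) + 0.1
    for _ in range(n_iter):
        U *= (A @ V) / (U @ (V.T @ V) + eps)
        V *= (A.T @ U) / (V @ (U.T @ U) + eps)
    return U, V

U_nmf, V_nmf = nmf_multiplicative(A, K=1)
```

In this sketch the largest entry of `U_nmf` belongs to the central node, and the entry of `V_nmf` for the central node is zero, matching the interpretation given for the Star Network.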
Fig. 1. Rank 1 non-negative and singular value matrix factorizations. The SVD is computed with the Laplacian matrix.
Star Network: the adjacency matrix has first row $(0, 1, 1, 1, 1)$ and zeros elsewhere. NMF: $U = (2, 0, 0, 0, 0)^T$, $V = (0, 0.5, 0.5, 0.5, 0.5)^T$. SVD: $U = (0.9, -0.2, -0.2, -0.2, -0.2)^T$, $V = (0, -0.5, -0.5, -0.5, -0.5)^T$.
Ring Network: the adjacency matrix has rows $(0, 1, 0, 0, 1)$, $(1, 0, 1, 0, 0)$, $(0, 1, 0, 1, 0)$, $(0, 0, 1, 0, 1)$, $(1, 0, 0, 1, 0)$. NMF: $U = (0.9, 0.9, 0.9, 0.9, 0.9)^T$, $V = (0.4, 0.4, 0.4, 0.4, 0.4)^T$. SVD: $U = V = (-0.6, 0.5, -0.2, -0.2, 0.5)^T$.
An interesting fact about NMF is that the estimates
are always rescalable (scale invariant). For example, with
the Star Network, we can multiply $U$ by 0.5 and $V$ by 2
to obtain
$$U = (1, 0, 0, 0, 0)^T, \quad V = (0, 1, 1, 1, 1)^T. \tag{2}$$
Their product $U V^T$ is unchanged with the rescaling.
Thus when interpreting NMF estimates, one cannot compare the magnitudes of $U$ with $V$. Instead, $U$ and $V$
should be considered separately, with emphasis on the
relative distributions of scores.
For instance, NMF vectors of the Ring Network show
each node with an equal score for incoming and outgoing connectivity. The fact that U contains larger elements
than V is arbitrary. However, the assignment of equal
values within U and V shows each node is equally
important to interconnectivity.
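The rescaling property can be checked numerically. The short sketch below uses the Star Network NMF estimates from Fig. 1 and the scaling factor 0.5 from the text; normalizing each factor to unit sum is our own illustrative way of looking at relative scores, not a step prescribed by the method.

```python
import numpy as np

# NMF estimates for the Star Network as reported in Fig. 1.
U = np.array([[2.0], [0.0], [0.0], [0.0], [0.0]])
V = np.array([[0.0], [0.5], [0.5], [0.5], [0.5]])

# Multiply U by 0.5 and V by 2: the product U V^T does not change.
c = 0.5
U_scaled, V_scaled = c * U, V / c
same_product = np.allclose(U @ V.T, U_scaled @ V_scaled.T)

# Relative scores within each factor are unaffected by the rescaling
# (unit-sum normalization is an illustrative assumption).
U_rel = U / U.sum()
V_rel = V / V.sum()
```

Because only the within-factor distribution of scores is invariant, `U_rel` and `V_rel` are the quantities that remain comparable across equivalent solutions.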
The rescaling issue does not apply to SVD due to
the imposed orthonormality constraints. Yet, the SVD
magnitudes tend to fluctuate with the Ring Network and
are harder to interpret given the network structure.
With noise and time-varying evolution, connectivity
patterns can be hidden from traditional approaches. Our
approach utilizes matrix factorizations with additional
constraints to filter or smooth out the noise, and provide
a basis for visualization of node evolution and graph
structure. Before moving to our proposed approach, we
provide some background for matrix factorizations with
additional constraints.
3.2 Penalized Matrix Factorizations
The use of additional constraints in matrix factorizations
is a common technique to more fully reveal structure
within the data. We refer to this class of models as penalized matrix factorizations, since usually the constraints
are represented as penalties using the Lagrangian form
of an objective function.
In penalized matrix decompositions, the factorized
matrices are obtained through minimizing an objective
function that consists of a goodness of fit component and
a roughness penalty. The strength of the penalty is set by
the user, where a larger penalty encourages smoother U
and V . For instance, [41], [42], [43] add penalties for SVD
on certain types of data. These penalties effectively relax
the rigid orthogonality constraints of SVD, thus allowing
the factorization to better represent the particular data
structures. Penalties with NMF are also common (see
[44], [45], [46], [47], [48] and references therein).
These previous works usually consider a static setting,
that is, applying factorization to a single matrix. We use
penalties as a way to extend matrix factorization to a
dynamic sequence of graphs. Thus, our problem poses
additional challenges because we observe a sequence of
adjacency matrices, and it does not directly fit into existing
approaches due to either the time series component or
multiple, correlated nodes at each time point.
In the next sections, we present the additional constraints to extend matrix factorizations for graph sequences, develop the constraints using a Lagrangian
penalty in an optimization, and provide estimation algorithms.
4 OVERVIEW OF PROPOSED APPROACH
Given a time series of networks $\{G_t = (V_t, E_t)\}_{t=1}^{T}$
with corresponding adjacency matrices $\{A_t\}_{t=1}^{T}$, the goal
is to produce a sequence of lower dimensional matrix
factorizations.
To enhance their visualization and interpretability,
we impose certain constraints on the factorizations. In
particular, the constraints aim to satisfy the following
properties:
1) The evolving basis Ut should exhibit temporal stability to preserve the mental map.
2) Nodes that are known to be similar at a particular
time should be close together in Ut .
3) Insignificant nodes should be set exactly to zero in
Vt to enhance interpretability.
The first property aims to preserve the “mental map”
in displays of Ut. This is a fundamental concern, as the
effectiveness of such representations relies on the human
ability to perceive and remember changes [4]. Node
trajectories are visually smooth when this property is satisfied. As a consequence, time plots of each node become
informative. The user can see typical time-varying node
behaviors, and the number and types of behaviors in the
data. The time plots form a set of static displays that
explicitly incorporate the temporal dimension. These
aspects help with detailed analysis and avoid difficulties
with animated or dynamic graph layouts when given a
large number of time points or nodes.
Where the first property deals with temporal structure,
the second deals with ‘spatial’ structure. In fact, the
corresponding constraint encourages nodes in the same
group or “cluster” to evolve together. Thus, this property
is useful when incorporating prior knowledge of node
groups at different points in time. If such information is
unknown a priori, then this property/constraint can be
omitted.
The third property deals with the time-varying factors
(Vt ). Setting unimportant nodes to zero improves overall interpretability of visuals and facilitates analysis by
identifying important nodes at different points in time.
A heatmap or display of nonzero status of each element
is appropriate, due to the penalty form on Vt .
In addition to the displays of Ut and Vt , matrix
factorization procedures can be useful for community
detection and filtration of the observed graphs. With any
estimation of $U_t$ and $V_t$, a reconstruction of the given
graph is also obtained by taking the product $U_t V_t^T$. With
large and complex graphs, the filtered versions highlight
important relations and reduce clutter in subsequent
visualizations by selectively removing edges.
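One simple way to turn the reconstruction into an edge filter is sketched below. The function name `filter_graph` and the absolute `threshold` on the fitted values are hypothetical choices made for illustration; the text does not specify how edges are selected for removal.

```python
import numpy as np

def filter_graph(A_t, U_t, V_t, threshold):
    """Keep observed edges whose fitted value (U_t V_t^T)_ij reaches `threshold`.

    The thresholding rule is an assumption made for this sketch.
    """
    fitted = U_t @ V_t.T                      # reconstruction of the graph
    keep = (A_t != 0) & (fitted >= threshold)
    return np.where(keep, A_t, 0.0)           # removed edges are set to zero

# Toy example: the factors emphasize the relation between nodes 0 and 1.
A_t = np.ones((4, 4)) - np.eye(4)
U_t = np.array([[1.0], [1.0], [0.0], [0.0]])
V_t = np.array([[1.0], [1.0], [0.0], [0.0]])
A_filtered = filter_graph(A_t, U_t, V_t, threshold=0.5)
```

The filtered matrix keeps the original edge weights, so it can be passed directly to any layout routine in place of the observed graph.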
4.1 Illustrative Example
Before discussing mathematical and procedural details, we illustrate a main benefit of the proposed approach with simulated data. We consider a sequence of
weighted, directed random graphs, with two embedded
groups whose within-group connections are time-varying.
Weighted directional networks with evolving groups are
of interest in diverse areas, including economic, technological, and biological networks.
Fig. 2 illustrates an important aspect of the simulation,
namely that within-group connections evolve according to
the following functional shapes
$$f_1(t) \propto \frac{t - 25}{\sqrt{1 + (t - 25)^2}}, \tag{3}$$
$$f_2(t) \propto I\{t > 33\}. \tag{4}$$
Nodes belonging to the first group connect to each other
with weights following a sigmoid (growth) curve, which
may be conceptually similar to countries in trading
networks that experience persistent and rapid increases
in trade. This group is composed of nodes 80 to 100.
Nodes belonging to the second group trade with each
other at a stable level only after a particular time. This
type of pattern could be observed with citation networks,
when papers (nodes) enter into the network, and quickly
reach their maximal number of connections. This group
is composed of nodes 10 to 20. All other edges exist
independently with a fixed probability, with average
weights held fixed. There are 100 time points and 100
nodes in each time period.
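A data set of this kind could be generated along the following lines. This is only a sketch under several assumptions that the text leaves open: the edge probability `p_edge`, the signal strength `peak`, the noise level `noise_sd`, the shift and scaling that make the curve in (3) non-negative, and the choice to add the group signal on top of the background edges are all hypothetical.

```python
import numpy as np

def simulate_sequence(T=100, n=100, p_edge=0.1, peak=5.0, noise_sd=0.1, seed=0):
    """Simulate T weighted, directed graphs with two evolving groups."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, T + 1)
    # Curve (3), shifted and scaled into (0, 1) so weights stay non-negative.
    f1 = 0.5 * (1.0 + (t - 25) / np.sqrt(1.0 + (t - 25) ** 2))
    # Curve (4): the second group switches on after t = 33.
    f2 = (t > 33).astype(float)
    group1 = np.arange(79, 100)   # nodes 80 to 100 (1-based)
    group2 = np.arange(9, 20)     # nodes 10 to 20 (1-based)
    A = np.zeros((T, n, n))
    for s in range(T):
        # Background edges exist independently with fixed average weight.
        exists = rng.random((n, n)) < p_edge
        weights = np.abs(1.0 + noise_sd * rng.standard_normal((n, n)))
        A_s = np.where(exists, weights, 0.0)
        # Within-group weights follow the group's curve over time.
        for group, f in ((group1, f1), (group2, f2)):
            m = group.size
            noise = np.abs(1.0 + noise_sd * rng.standard_normal((m, m)))
            A_s[np.ix_(group, group)] += peak * f[s] * noise
        np.fill_diagonal(A_s, 0.0)   # no self-loops
        A[s] = A_s
    return A, f1, f2
```

Calling `simulate_sequence()` with the defaults yields 100 time points and 100 nodes, as in the example described above.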
Fig. 2. Schematic of the illustrative example. Edges
exist independently with some probability, with average
weights given by the dashed line. The two within group
connections have average weights that follow particular
curves over time.
We use these data to compare four models:
1) The direct NMF applies classical NMF to each data
slice separately, without any additional smoothness
constraints.
2) The penalized NMF applies NMF with our proposed additional constraints.
3) The direct SVD applies classical SVD to each data
slice separately.
4) The penalized SVD applies SVD with our proposed additional constraints.
For each model, we fit a sequence of one dimensional
factorizations to facilitate visualization. The parameters that control the temporal and sparsity constraint
strengths are searched over a grid of values. Displays
for each set of parameters are made, with the one that
emphasizes the structure most shown in Fig. 3. This
strategy is feasible, as the estimation procedure is computationally efficient. For each estimate, we reorganize
$\{\hat{U}_t\}$ into multidimensional time series
$$\begin{pmatrix} (\hat{U}_1)_1 & (\hat{U}_1)_2 & \cdots & (\hat{U}_1)_n \\ (\hat{U}_2)_1 & (\hat{U}_2)_2 & \cdots & (\hat{U}_2)_n \\ \vdots & \vdots & & \vdots \\ (\hat{U}_T)_1 & (\hat{U}_T)_2 & \cdots & (\hat{U}_T)_n \end{pmatrix}, \tag{5}$$
where each row corresponds to a time point and columns
index nodes. Larger rank factorizations would result
in multiple time-series for every node, one for each
dimension. The same organizational scheme is used for
$\{\hat{V}_t\}$.
Time series for the node positions in $\hat{U}_t$ and $\hat{V}_t$ are
shown in Fig. 3. For all model specifications, one could
identify the three node groups from the time plots.
However, the penalized NMF is most representative of
the actual functions the groups follow. Similarly, the
heatmaps for the penalized factorizations are the most
satisfactory, as the sparsity patterns complement the time
plots by identifying when groups become distinguishable.
Altogether, the smoothed NMF displays appear most
informative, as the main structural patterns underlying
the data are more clearly represented and match the
true functions governing the data generation mechanism
best. The displays combine to show typical time-varying
node behaviors, the number and types of node evolution
in the data, and when node groups become active.
5 OPTIMIZATION PROBLEM
Returning to our given time series of adjacency matrices
$\{A_t\}_{t=1}^{T}$, the first component of the proposed objective
function measures goodness of fit:
$$\min_{\{U_t, V_t\}_{t=1}^{T}} \sum_{t=1}^{T} \|A_t - U_t V_t^T\|_F^2. \tag{6}$$
After translating the properties above into constraints,
we write them as penalties through Lagrange multipliers
to facilitate estimation. Thus, the factorized matrices are
obtained through minimizing an objective function that
consists of a goodness of fit component and roughness
penalties. The final, proposed objective function becomes
$$\begin{aligned}
\min_{\{U_t, V_t\}_{t=1}^{T}} \; & \sum_{t=1}^{T} \|A_t - U_t V_t^T\|_F^2 \\
& + \lambda_t \sum_{t=1}^{T} \sum_{\tilde{t} = t - W/2}^{t + W/2} \|U_t - U_{\tilde{t}}\|_F^2 \\
& + \lambda_g \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j \in N(i)} \|U_t(i,:) - U_t(j,:)\|_2^2 \\
& + \lambda_s \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{K} |V_t(i,j)|,
\end{aligned} \tag{7}$$
where $W$ is a small integer representing a time window
and $N(i)$ denotes the neighborhood or group that node
$i$ belongs to. The parameters $\lambda_t$, $\lambda_g$, $\lambda_s$ and $W$ are all
non-negative numbers set by the user to steer the analysis. In many applications it is appropriate to use $W = 2$
(looking one time period ahead and before) and $\lambda_g = 0$,
so that only $\lambda_t$ and $\lambda_s$ need to be defined.
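To make the four terms of (7) concrete, the sketch below evaluates the objective for given factors. The array layout (time on the first axis), the function name `penalized_objective`, the decision to skip window positions that fall outside the observed time range, and the use of a single time-invariant list of groups are assumptions made for this illustration.

```python
import numpy as np

def penalized_objective(A, U, V, lam_t, lam_s, lam_g=0.0, W=2, groups=None):
    """Evaluate objective (7).

    A: (T, n, n) adjacency matrices; U, V: (T, n, K) factors.
    groups: optional list of index arrays, assumed fixed over time.
    """
    T = A.shape[0]
    half = W // 2
    # Goodness of fit: sum of squared Frobenius norms.
    fit = sum(np.linalg.norm(A[t] - U[t] @ V[t].T, "fro") ** 2 for t in range(T))
    # Temporal penalty: compare U_t with neighbors inside the window
    # (positions outside 1..T are skipped, which is an assumption).
    temporal = 0.0
    for t in range(T):
        for s in range(max(0, t - half), min(T, t + half + 1)):
            temporal += np.linalg.norm(U[t] - U[s], "fro") ** 2
    # Group penalty: squared distances between rows of U_t in the same group.
    group = 0.0
    if groups is not None and lam_g > 0:
        for t in range(T):
            for members in groups:
                rows = U[t][members]
                diffs = rows[:, None, :] - rows[None, :, :]
                group += np.sum(diffs ** 2)
    # Sparsity penalty: elementwise absolute values of V_t.
    sparsity = np.abs(V).sum()
    return fit + lam_t * temporal + lam_g * group + lam_s * sparsity
```

Evaluating this function over a grid of `lam_t` and `lam_s` values is one way to monitor how much goodness of fit is traded for smoothness and sparsity.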
The first penalty term controls for short term fluctuations in the evolving basis Ut . Hence, the visual effect of
setting larger λt is to create smoother paths over time for
each node in Ut . Larger penalization levels force nodes
to have similar positions as in neighboring time steps.
In fact, if λt is set to an extremely large number, then Ut
will be approximately constant for all time periods, i.e.,
$U_t \approx U_{t'}$ for all time points $t$ and $t'$. This relation provides a useful guideline and
upper bound when setting the penalty level. As shown
in Fig. 5 in the supplemental material, if λt is too large,
the trajectories overlap and exhibit little variation due to
over-smoothing. λt could also be set to zero, but then the
time plots become difficult to interpret due to temporal
instabilities.
λg corresponds to the second (group) property in the
previous section that nodes in the same group should
evolve similarly. The actual penalty is similar in spirit
to the first penalty term. It controls the fluctuations of
the group within the basis Ut at each point in time.
Without prior knowledge of group structure, λg is set to
zero, which makes the constraint optional. Otherwise, if such
structure is known, larger values of λg more strongly
encourage groups to evolve similarly in Ut.
Fig. 3. Estimates for the illustrative example under different model specifications (panels: Direct NMF, Penalized NMF, Direct SVD, Penalized SVD, each shown for Ût and V̂t). Each line (trajectory) corresponds
to a node.
λs corresponds to the third property that unimportant
nodes should be set exactly to zero in Vt . Appropriate λs
tends to emphasize the main patterns in the data. Very
large λs can result in numerical instability and degenerate solutions, for example, forcing all values to zero. As a general
guideline, small-to-moderate amounts of sparsity seem
to improve interpretability. In practice, we
find that setting λs five to ten times smaller than
λt yields interpretable displays. This will be discussed
further and demonstrated in the applications.
The parameter W controls the window width for
smoothing, i.e., the number of neighboring time steps to
average over. Larger values of W mean that the model
has more memory, so it incorporates more time points
for estimation. This risks missing sharper changes in
the data and only detecting the most persistent patterns.
On the other hand, small values of W make the fitting
more sensitive to sharp changes, but increase short term
fluctuations due to the smaller number of observations. We
set W = 2 (looking one time period ahead and before)
for all presented case studies. Larger values could be
used in very noisy settings to further smooth results.
6 OBTAINING ESTIMATES
In this section, we focus on the algorithmic aspects of
NMF, since we find it is preferable to SVD for visual
analysis. Additional details are provided in the supplemental material on the optimization and corresponding
algorithm for SVD.
Below we give the algorithm accommodating temporal and sparsity penalties only, i.e., without the group
penalty. Though the final algorithm with a group penalty
is almost identical to the one presented below, there is
some added algebraic complexity stemming from potentially arbitrary group structure. We avoid this difficulty
by writing the temporal and group penalties with a
Laplacian smoothing matrix. Details are provided in the
supplemental material.
The benchmark algorithm for NMF was proposed by
Lee and Seung [33], [40], and is known as ‘multiplicative
updating’. The algorithm can be viewed as an adaptive
gradient descent, and was shown to find local minima of the objective function. It is relatively simple to
implement, but can converge slowly due to its linear
convergence rate [49]. In practice we find that after a
handful of iterations, the algorithm results in visually
meaningful factorizations.
The multiplicative updating algorithm is shown in
Algo. 1. The updating rules are derived from standard
arguments. Details are again given in the supplemental
material.
Algorithm 1 Algorithm for penalized NMF with temporal and sparsity constraints
1: Set constants $\lambda_t$, $\lambda_s$, $W$.
2: Initialize $\{U_t\}$, $\{V_t\}$ as dense, positive random matrices.
3: repeat
4:   for $t = 1, \ldots, T$ do
5:     Set
       $$(U_t)_{ij} \leftarrow (U_t)_{ij} \frac{\left(A_t V_t + \lambda_t \sum_{\tilde{t} = t - W/2}^{t-1} U_{\tilde{t}} + \lambda_t \sum_{\tilde{t} = t+1}^{t + W/2} U_{\tilde{t}}\right)_{ij}}{\left(U_t V_t^T V_t + W \lambda_t U_t\right)_{ij}}.$$
6:     Set
       $$(V_t)_{ij} \leftarrow (V_t)_{ij} \frac{(A_t^T U_t)_{ij}}{(V_t U_t^T U_t)_{ij} + \lambda_s}.$$
7:   end for
8: until Convergence
7 VISUALIZING EVOLVING NODE CONNECTIVITIES
After estimating the matrix factors, we reorganize each
dimension of $\{U_t\}$ and $\{V_t\}$ into multidimensional time
series, as shown in (5). Time plots and heatmaps (or
displays of non-zero entries) for each dimension of $U_t$
and $V_t$, respectively, are generated.
From the time plots of Ut, one can see typical time-varying node trajectories, and the number and types of
nodes in the data. Vt are useful for identifying when
particular nodes or groups of nodes become important
from a connectivity perspective. In particular, the penalty
on Vt drives nodes to zero exactly when unimportant.
As a consequence, we visualize the sparsity pattern with
heatmaps.
When the factorization displays are combined, the
user can discover potential groups that evolve similarly
(Ut ) and whether the trajectories are important (Vt ). This
provides analysts a way to uncover dynamic structure
different from typical dense clumps on the network.
Moreover, it provides an exploratory view of all time
points and node connectivities simultaneously in a set
of static displays.
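The estimation and reorganization steps can be sketched together in NumPy. The code follows the update rules of Algorithm 1, with a few assumptions of our own: a fixed number of iterations stands in for the convergence check, a small constant `eps` guards against division by zero, and at the first and last time points the factor $W$ in the denominator is replaced by the number of neighbors that actually exist. The names `penalized_nmf` and `to_time_series` are hypothetical.

```python
import numpy as np

def penalized_nmf(A, K=1, lam_t=1.0, lam_s=0.1, W=2, n_iter=100, eps=1e-9, seed=0):
    """Multiplicative updates with temporal and sparsity penalties.

    A: (T, n, n) non-negative adjacency matrices. Returns U, V of shape (T, n, K).
    """
    rng = np.random.default_rng(seed)
    T, n, _ = A.shape
    half = W // 2
    U = rng.random((T, n, K)) + 0.1   # dense, positive initialization
    V = rng.random((T, n, K)) + 0.1
    for _ in range(n_iter):
        for t in range(T):
            lo, hi = max(0, t - half), min(T - 1, t + half)
            # Sum of neighboring bases before and after time t.
            neighbours = U[lo:t].sum(axis=0) + U[t + 1:hi + 1].sum(axis=0)
            n_nb = hi - lo            # equals W away from the boundaries
            num = A[t] @ V[t] + lam_t * neighbours
            den = U[t] @ (V[t].T @ V[t]) + n_nb * lam_t * U[t] + eps
            U[t] *= num / den
            num = A[t].T @ U[t]
            den = V[t] @ (U[t].T @ U[t]) + lam_s + eps
            V[t] *= num / den
    return U, V

def to_time_series(F, k=0):
    """Arrange dimension k of a factor sequence as in (5): rows are time
    points, columns are nodes."""
    return F[:, :, k]
```

Each column of `to_time_series(U)` can then be drawn as one line in a time plot, and `to_time_series(V) > tol` for a small tolerance `tol` gives a non-zero display of the kind described above.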
7.1 Case Studies
7.1.1 Preferential Attachment Process
In this simulation, we observe 100 noisy snapshots of a
preferential attachment graph as it forms. Nodes attach
according to a preferential attachment model until 10000
nodes have ’attached’ to the embedding. We observe this
growing process at 100 uniformly spaced time points.
Thus, at each time point 100 new nodes attach to the
graph. We use source code from a networks MATLAB toolbox [50] that generates preferential attachment
graphs according to the standard model [10], [51].
In the preferential attachment model, $\Pi(i)$, which represents the probability that a new node connects to node
$i$, depends on node $i$’s degree. Specifically, we have
$$\Pi(i) \propto d_i, \tag{8}$$
where $d_i$ is the degree of the $i$th node. This generating
framework leads to large networks whose degree distribution follows a power-law distribution with parameter
γ = 3. Graphs with heavy-tailed degree distributions are
commonly observed in a variety of areas, such as the
Internet, protein interactions, and citation networks, among
others [52].
Fig. 4. Fitted values for U and V over time for the preferential attachment embedding. Panels: (a) no penalties; (b) λt = 50, λs = 5; (c) λt = 100, λs = 5. The left column shows a
time plot of Ut over different parameter values. Each line
corresponds to a node on the graph. The right column
identifies the nonzero elements of Vt. Each row corresponds to a node on the graph and time varies along the
horizontal axis.
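The attachment rule in (8) can be sketched as follows. This is not the MATLAB toolbox code used for the simulation; it is a minimal NumPy version in which each new node attaches with a single edge and the process starts from two connected nodes, both of which are assumptions made for illustration.

```python
import numpy as np

def preferential_attachment(n_nodes, seed=0):
    """Grow a graph where a new node links to node i with probability
    proportional to the degree d_i (one edge per new node, by assumption)."""
    rng = np.random.default_rng(seed)
    degree = np.zeros(n_nodes)
    edges = [(1, 0)]                 # seed graph: two connected nodes
    degree[0] = degree[1] = 1
    for new in range(2, n_nodes):
        probs = degree[:new] / degree[:new].sum()
        target = int(rng.choice(new, p=probs))
        edges.append((new, target))
        degree[new] += 1
        degree[target] += 1
    return edges, degree

edges, degree = preferential_attachment(500)
```

Because nodes are appended in order, a snapshot of the growing graph at an earlier time is simply a prefix of `edges`, which gives one way to build a sequence of graphs from a single run.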
Given appropriate levels of penalization, NMF yields
interpretable decompositions with just a sequence of
one-dimensional (K = 1) approximations. In one-dimensional space, important nodes have distinct trajectories that indicate their relative importance to the
network.
In preferential attachment, nodes that acquire more
connections will increase their degree at a higher rate as
time goes on. This consequence of the generating process
is reflected in the estimated Ut , shown in Fig. 4. We
can see many nodes with trajectories near zero, or with
trajectories that increase at a slow rate. It is difficult to
discern by eye which of these curves are important. The
displays of Vt that indicate nonzero elements convey this
information. We show the non-zero status to denote the
on-off relationship clearly to the user, since some nodes
have values very close, but not exactly equal to zero. The
connectivity pattern, such as attachment order, is clearly
conveyed in the pseudo upper triangular form. Rows
(nodes) are ordered according to their sums.
Fig. 4 shows the estimated factorizations for three different levels of penalization. Without a temporal penalty,
the time plots emphasize only the most dominant, highest degree node. Setting non-zero λt smooths the node
trajectories and highlights important nodes as other
vertices attach to them. Sparsity is then important to keep
displays of Vt uncluttered and hence interpretable.
7.1.2 arXiv Citations
We investigate a time series of citation networks provided as part of the 2003 KDD Cup [53]. The graphs
are from the e-print service arXiv for the ‘high energy
physics theory’ section.
The data covers papers in the period from October
1993 to December 2002, and is organized into monthly
networks. In particular, if paper i cites paper j, then
the graph contains a directed edge from i to j. Any
citations to or from papers outside the dataset are not
included. We also choose to aggregate edges, that is, the
citation graph for a given month will contain all citations
from the beginning of the data up to, and including,
the current month. Altogether, there are 22750 papers
(nodes) with 176602 edges over 112 months. Statistical
properties of the data were discussed in [54], which
found that the networks feature decreasing diameter
over time and heavy-tailed degree distributions.
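
The edge aggregation described above can be sketched as follows. The input format, a list of (month index, citing paper, cited paper) triples, and the function name are assumptions made for illustration; the KDD Cup files themselves are organized differently.

```python
def cumulative_edge_sets(citations, n_months):
    """Return one edge set per month, each holding every citation seen so far.

    citations: iterable of (month_index, citing, cited), month_index in
    0..n_months-1 (hypothetical input format).
    """
    by_month = [set() for _ in range(n_months)]
    for month, citing, cited in citations:
        by_month[month].add((citing, cited))   # directed edge citing -> cited
    graphs, seen = [], set()
    for edges in by_month:
        seen |= edges                          # aggregate up to this month
        graphs.append(frozenset(seen))
    return graphs


toy = [(0, "b", "a"), (2, "c", "a"), (2, "c", "b")]
graphs = cumulative_edge_sets(toy, n_months=3)
```

Because edges are never removed, each monthly graph contains all earlier ones, which is why papers appear to 'attach' over time much as in the preferential attachment example.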
Fig. 5 compares the direct and penalized factorizations
using a sequence of one-dimensional approximations
(an inner rank of one, K = 1). As observed with the
preferential attachment experiments, the time plots are
difficult to read without penalties. The paper trajectories are smoothed effectively and the important papers
are highlighted by employing penalties. Moreover, the
displays of Vt show behavior similar to the preferential
attachment setting. A main difference is that papers do
not appear uniformly throughout time. They ‘attach’ at
a faster rate around year 2000. Again, the displays show
the non-zero status to denote the on-off relationship
more clearly to the user, as some nodes have values very
close, but not exactly equal to zero.
As the penalized fits show, there are two important
periods in the data. The first period covers 1996-1999,
and featured papers mostly on an extension of string
theory called M-theory. M-theory was first proposed in
1995 and led to new research in theoretical physics. A
number of scientists, including Witten, Sen, Polchinski,
and Duff, were important to the historical development
of the theory, and as seen in Table 1, our NMF approach
identifies these important authors and their works. From
1999-2000 the citations to these papers decreased, while
focus shifted to other topics and subfields that M-theory
gave rise to. The top papers from year 2000 and after
also include review papers on M-theory, signaling the
maturity of the topic. Tables 1 and 2 show the top 10
papers in each time period.
Once again, we have provided a simple workflow that
allows the user to visually uncover the patterns in the
data. We first fit the penalized, rank 1 NMF for each
graph. We display the estimated components, and from
this are able to identify the key papers and individuals
that contributed to high energy and theoretical physics.
8 VISUALIZING COMMUNITY DETECTION AND EMBEDDED STRUCTURE
Next, we discuss interpretations and visualizations that
can be facilitated using matrix factorizations as a community detection and filtering mechanism.
8.1 Community Detection
In addition to visualizing evolving connectivity with Ut
and Vt , matrix factorization procedures are useful for
overlapping community detection. In particular, we use
the smooth Ut to group nodes resulting in a community
structure that is temporally stable. The rank (K) of Ut
corresponds to the number of communities in the graph
sequence. The contribution of each community to node i
is measured by the relative magnitude of the i-th element
of each dimension of Ut, e.g., $(U_t)_{ik} / \sum_{k'=1}^{K} (U_t)_{ik'}$.
In the following case studies, we utilize a small extension to traditional graph drawing that effectively
translates the overlapping community structure to the
user. Specifically, we color each node according to the
relative contribution of each community with a pie chart.
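
The pie-chart slices for each node follow directly from this normalization. A minimal sketch is given below; the function name and the treatment of nodes whose loadings are all zero are assumptions.

```python
def community_shares(U):
    """Rescale each row of U (one row per node, K nonnegative loadings)
    so that it sums to one, giving the pie-chart slices for that node."""
    shares = []
    for row in U:
        total = sum(row)
        # Assumption: a node with all-zero loadings gets all-zero slices.
        shares.append([x / total if total > 0 else 0.0 for x in row])
    return shares


slices = community_shares([[2.0, 2.0], [0.0, 3.0], [0.0, 0.0]])
```

Here the first node is split evenly between the two communities and the second belongs entirely to the second community.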
In principle, one can also use Vt to measure relative
community contribution. However, this can sometimes
be visually unsatisfactory due to transient community
assignments. A previous work in a static setting [38]
used the full product $(U_t)_{ik}(V_t)_{jk} / \sum_{k'=1}^{K} (U_t)_{ik'}(V_t)_{jk'}$ to
measure the relative contribution of each community
to each edge $(A_t)_{ij}$. First, all edges are assigned to the
community with largest relative contribution. Then, for
the given node, the pie-chart displays the proportion of
its edges that belong to each community. However, this
can also be visually unsatisfactory due to unstable community assignments. Using the smooth Ut and defining
communities in terms of nodes ensures the stability of
the community structure through time.
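
For comparison, the edge-based assignment attributed to [38] can be sketched as below. This is an illustration of the description in the text, not the implementation of [38]; the function name is hypothetical, and counting both incoming and outgoing edges as a node's edges is an assumption.

```python
def edge_based_shares(edges, U, V):
    """Assign each edge (i, j) to the community k maximizing U[i][k] * V[j][k],
    then return, per node, the fraction of its edges in each community."""
    K = len(U[0])
    counts = {}
    for i, j in edges:
        contrib = [U[i][k] * V[j][k] for k in range(K)]
        best = max(range(K), key=contrib.__getitem__)
        # Assumption: an edge counts for both of its endpoints.
        for node in (i, j):
            counts.setdefault(node, [0] * K)[best] += 1
    return {node: [c / sum(cs) for c in cs] for node, cs in counts.items()}


U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 2.0]]
pies = edge_based_shares([(0, 2), (1, 2), (2, 0)], U, V)
```

Because every edge is assigned by a hard maximum, a small change in Vt from one time point to the next can flip an edge to another community, which is one way the instability mentioned above can arise.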
8.2 Discovering and Visualizing Embedded Structure
The matrix factorizations also serve as a filter through
the product $U_t V_t^T$. The filtered graphs, which selectively
remove unimportant edges, can help highlight important
node relations when used in combination with network
visualization tools.
Fig. 5. Fitted values for Ut and Vt for the arXiv data with λt = 100. Each light gray line corresponds to a paper (node)
on the graph. The bold lines show the average of the 10 papers with highest average Û from 1996-1999, and 2001
onwards (dashed). Each row in the heatmaps corresponds to a paper and time varies along the horizontal axis.
Panels: No Penalty; λs = 1; λs = 5; λs = 20.
TABLE 1
The top 10 papers with highest average Û from 1996-1999. # citations counts all references to the work, including by
papers outside of our data. These counts were obtained via Google.
| Title | Authors | In-Degree | Out-Degree | # citations (Google) |
|---|---|---|---|---|
| Evidence for F-Theory | Vafa | 135 | 4 | 993 |
| Notes on D-Branes | Polchinski, et al. | 75 | 7 | 556 |
| Harmonic superpositions of M-branes | Tseytlin | 54 | 4 | 374 |
| Comments on String Dynamics in Six Dimensions | Seiberg and Witten | 44 | 2 | 285 |
| Strings on Orientifolds | Dabholkar and Park | 39 | 2 | 195 |
| M-Theory (the Theory Formerly Known as Strings) | Duff | 35 | 3 | 283 |
| An Introduction to Non-perturbative String Theory | Sen | 34 | 21 | 189 |
| BPS Quantization of the Five-Brane | Dijkgraaf, et al. | 33 | 3 | 156 |
| Black Holes and Solitons in String Theory | Youm | 29 | 24 | 163 |
| BPS Spectrum of the Five-Brane and Black Hole Entropy | Dijkgraaf, et al. | 28 | 4 | 181 |
TABLE 2
The top 10 papers with highest average Û from 2001 onwards.
| Title | Authors | In-Degree | Out-Degree | # citations (Google) |
|---|---|---|---|---|
| Supergravity and a Confining Gauge Theory: Duality Cascades and χSB-Resolution of Naked Singularities | Klebanov and Strassler | 197 | 46 | 1271 |
| The String Dual of a Confining Four-Dimensional Gauge Theory | Polchinski and Strassler | 179 | 83 | 435 |
| Gravity Duals of Supersymmetric SU(N) x SU(N+M) Gauge Theories | Klebanov and Tseytlin | 112 | 40 | 453 |
| Curvature Singularities: the Good, the Bad, and the Naked | Gubser | 102 | 70 | 211 |
| M(atrix) Theory: Matrix Quantum Mechanics as a Fundamental Theory | Taylor | 60 | 274 | 232 |
| TASI Lectures: Introduction to the AdS/CFT Correspondence | Klebanov | 25 | 74 | 137 |
| Anatomy of Two Holographic Renormalization Group Flows | Bianchi, et al. | 21 | 55 | 54 |
| D3-brane Holography | Danielsson, et al. | 12 | 69 | 26 |
| The Holographic Renormalization Group | Boer | 6 | 46 | 34 |
| Strings, Branes and Extra Dimensions | Forste | 5 | 302 | 56 |
The different penalty settings are used to control how
edges are removed from the original graph. For instance,
with a very large temporal penalty, Ut becomes effectively constant. This forces the reconstruction to reflect
an “average” community structure from the full time
period. There are additional issues to consider when
primarily interested in filtering that we highlight next.
Unweighted graphs feature adjacency matrices that
are defined by the location of zeros. However, Algo.
1 results in reconstructions $\hat{A}_t = U_t V_t^T$ that commonly
contain elements close to, but not exactly equal to zero.
Hence, if graph filtering is the goal, one must also set
a small threshold to define edges. In particular, if the
estimated value (Â)ij is less than the threshold, then
(Â)ij is set to zero. After this thresholding step, the
filtered graph can be fed into traditional graph drawing
tools.
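
The thresholding step can be sketched as follows, assuming the fitted factors for one time point are available as lists of rows. The function name and the output format, a list of (i, j, weight) triples that a graph drawing tool could consume, are assumptions.

```python
def filtered_edges(U, V, threshold):
    """Keep edges (i, j) whose reconstructed weight sum_k U[i][k] * V[j][k]
    is at least `threshold`; smaller entries are treated as zero."""
    edges = []
    for i, u_row in enumerate(U):
        for j, v_row in enumerate(V):
            weight = sum(a * b for a, b in zip(u_row, v_row))
            if weight >= threshold:
                edges.append((i, j, weight))
    return edges


kept = filtered_edges(U=[[1.0], [0.1]], V=[[0.05], [0.9]], threshold=0.2)
```

In this toy example only the edge from node 0 to node 1 survives; the three remaining reconstructed entries are small but nonzero and are removed.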
Since undirected graphs feature symmetric adjacency
matrices, it may seem natural to impose symmetry upon
our matrix factorization:
$$A_t \approx U_t U_t^T. \qquad (9)$$
We find that symmetric NMF is far more sensitive to
penalization than its general counterpart. It exhibits
less flexibility, since any additional constraint strongly
influences the overall accuracy of the estimation. On
the other hand, with general matrix factorization, as Vt
changes, Ut compensates in order for the final product to
reproduce the data as well as possible. For completeness,
the updating rules for the symmetric matrix factorizations are provided in the supplemental material.
8.3 Case Studies
Below two new case studies are presented. The arXiv
citation data is also discussed in the supplemental material.
8.3.1 Catalano Communication Network
Fig. 6. The cell phone network from a day using a force
directed layout algorithm in igraph. Node 200 is colored
black. A filtered version of this graph is shown in Fig. 7.
We demonstrate the value of the proposed approach as
a data filter by analyzing the Catalano social network,
which was part of the VAST 2008 challenge [55]. The
synthetic data consists of 400 unique cell phone IDs
over a ten day period. Altogether, there are 9834 phone
records with the following fields: calling phone identifier, receiving phone identifier, date, time of day, call
duration, and cell tower closest to the call origin. The
purpose of the challenge was to characterize the social
structure over time for a fictitious, controversial sociopolitical movement. In particular, the challenge requires
identifying five key individuals that organize activities
and communications for the network; a hint was given
to challenge participants that node 200 is one of the
persons of interest. We use the first seven days of data to
illustrate our methodology, since there is a strong change
in the connection patterns during days 8-10 for node 200
(see [55], [56] and references therein).
Directed networks were constructed daily by drawing
an edge from the caller to the receiver. Fig. 6 shows an
example of one day’s network. The graph is too cluttered
to visually identify leaders of the network or get a sense
of the network structure.
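
Constructing the daily directed networks from the call records can be sketched as below. The record layout (caller, receiver, day) is a simplification of the fields listed above, and keeping a call count per edge is an assumption; the text only states that an edge is drawn from caller to receiver.

```python
from collections import defaultdict


def daily_call_graphs(records):
    """Group call records into one directed graph per day.

    records: iterable of (caller_id, receiver_id, day) (hypothetical layout).
    Returns {day: {(caller, receiver): number_of_calls}}.
    """
    graphs = defaultdict(lambda: defaultdict(int))
    for caller, receiver, day in records:
        graphs[day][(caller, receiver)] += 1
    return {day: dict(edges) for day, edges in graphs.items()}


calls = [(200, 1, 1), (200, 1, 1), (1, 17, 1), (200, 5, 2)]
by_day = daily_call_graphs(calls)
```

Taking only the keys of each daily dictionary gives the unweighted directed graph for that day.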
We apply the penalized NMF to filter the networks
and highlight the main structural patterns. Specifically,
we use a large temporal penalty to keep only the most
persistent interactions while removing transient communications. Thus, the temporal aspect of the data is
utilized through the penalization to discover important
interactions. The reconstructed network is shown in Fig.
7. The persons of interest and the hierarchical structure
of the communication network are clearly shown. Node
200 relays information to his neighbors (1,2,3,5), and
each neighbor disseminates information to his respective
subordinates.
The pie chart on each node displays the contribution of
each community measured with Ut . Fig. 7 shows nodes
higher up on the social hierarchy tend to belong to multiple communities, presumably since they disseminate
information to different groups of subordinates.
The filter was run using the following parameter
settings: K = 7, W = 2, λt = 10, λg = λs = 0, and a
threshold of 0.2 to discretize the reconstruction. Different
values of K, λt and the post processing threshold were
evaluated with a grid search. Parameter values were
chosen to emphasize readability and interpretability of
the filtered graph embeddings.
We find that the key persons of interest and the hierarchical structure are robust to the number of communities
(inner rank, K) of the reconstruction. However, the community assignments are not as interpretable with smaller
K. Additional details are given in the supplemental
material.
Fig. 7. The filtered network using Fig. 6 as input. A
force directed layout in igraph was used to create this
embedding. Nodes are colored by soft partitioning via the
penalized NMF. Labeled nodes: 1, 2, 3, 5, 200.
Since we apply the matrix factorization as a filter, accuracy in the reconstruction is important. We set λs = 0, as
sparsity can cause the reconstruction to miss important
edges.
The results could be sharpened further by utilizing
the additional information, such as geographic location
and call duration. This could be done in our modeling
approach by defining groups and employing a group
penalty.
VAST never officially released correct answers for
the challenge. However, our analysis closely matches
winning entries [56], [57], [58]. Treating the conclusions
of the entries as ground truth, we have provided a
simple workflow that uncovers the patterns in the data.
By plotting the filtered graphs, we are able to correctly
identify the key individuals that organize activities and
communications for the network. Moreover, the use
of matrix factorizations as a community detection tool
helps the graph layout communicate the organizational
structure underlying the socio-political movement.
8.3.2 Global Trade Flows
In this example, the data consists of annual, total bilateral
trade flows between 164 countries from 1980-1997 [59].
Thus, we observe a dynamic, weighted graph at 18 time
points, where each directional edge denotes the total
value of exports from one country to another. Since trade
flows can differ in size by orders of magnitude, we work
with trade values that are expressed in log dollars.
Fig. 7 in the Supplemental material shows examples
of the data. The representations are too dense to convey
much information. In fact, a typical analysis primarily relies on non-visual approaches. Aggregate statistics, such
as import and export totals or measures of centrality
are commonly used to identify important countries and
flows [60].
Our approach additionally utilizes the time aspect
of connections to uncover interesting patterns in the
data that can be successfully visualized using traditional techniques. Specifically, we reduce clutter in the
displays by removing unimportant flows, and produce
time-varying clustering structure that groups countries
according to their trading activities. Even with no drawn
edges, these time varying communities visually convey
trading partners and geopolitical alignment.
Fig. 8 displays the trade flows after applying penalized
NMF, and there are a number of interesting insights that
become apparent. First, Europe and Japan seem to be
trading hubs that connect other countries, such as the
United States and Russia. The coloring scheme, shown
without edges in the second row, denotes overlapping
community structure. We use three underlying communities (K = 3), and let each be colored as red, blue
or green. The particular color for a country is given
by the relative contribution of the three communities.
For example, the US and western Europe are bright
green, denoting that they belong only to one group. In
contrast, Russia is bright red indicating it belongs to
another group. Countries like China belong to multiple
communities, and the composition changes over time.
We can roughly interpret the green as Western economies
and red as the communist economies, led by the former
Soviet Union, which fell in 1989. A year after the Soviet Union fell, China, India and a number of Persian
Gulf and Middle Eastern countries that were partially
aligned with the Soviet Union began to trade more
with Europe and the US. This behavior continued so
that by 1997 green is the dominant color, even though
in 1980 the green and red are roughly equal. The third
group denoted by blue tends to consist of countries that
experienced average or lower growth rates in the data.
It primarily consists of African and Central American
countries, and is relatively stable.
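
The coloring scheme can be sketched by mapping the relative contribution of the three communities to the three color channels. Which community is drawn in which color, the hex output format, and drawing countries with no loading in black are assumptions made for illustration.

```python
def community_color(loadings):
    """Map three nonnegative community loadings, ordered (red, green, blue)
    by assumption, to an '#rrggbb' string of their relative contributions."""
    total = sum(loadings)
    if total == 0:
        return "#000000"   # assumption: no membership is drawn in black
    channels = [round(255 * x / total) for x in loadings]
    return "#{:02x}{:02x}{:02x}".format(*channels)


pure_green = community_color([0.0, 5.0, 0.0])
mixed = community_color([1.0, 3.0, 0.0])
```

A country loading on a single community gets a fully saturated color, as with the bright green and bright red examples above, while a country with mixed membership gets a blend whose composition can change from year to year.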
The number of communities is chosen according to
cross validation. This type of approach helps balance
model complexity (number of communities) and accuracy. We find that an NMF model with K = 3 is
best, since additional communities do not significantly
improve the reconstruction or interpretability of the
clustering. Details on cross validation are given in the
supplemental material.
The third row plots the networks using a force based
graph drawing algorithm. These layouts provide additional insights. The United States’ importance to global
trade increased over time and is more distinct, becoming
the most dominant node on the graph by 1990. European countries tend to form strong communities due
to heightened inter-continental trade. However, by the
1990’s there is an additional subcommunity, comprised
mostly of the so-called ’Asian miracles’: countries in
east Asia that experienced persistent and rapid economic
growth in the 1990’s [61], [62]. Lastly, the homogeneity of
country colors in the graphs supports the notion that the
world trade network has become more interconnected.
Thus, this analysis may foreshadow the benefits and
potential risks to the global economy stemming from
the modern day Euro crisis, in which multiple European
countries may fail if a single country fails.
9 DISCUSSION
The main idea behind the approach presented in this
paper is to abstract the network sequence to a sequence
of lower dimensional spaces using matrix factorizations.
Next, we highlight some of the strengths and weaknesses
of this approach.
9.1 Strengths
An important benefit is the versatility and scalability
of matrix factorization. Table 1 in the supplemental
material shows Algo. 1 run times for all case studies.
The computational speed is fast enough to use as a
preprocessing step before applying traditional visualization tools. Moreover, it is reasonable to select parameters
(λt , λs , K) with a grid search or cross validation.
Using the factorizations as a basis for an exploratory
visual tool can help users uncover different connectivity
patterns and evolution in the data. The displays of Ut
and Vt can lead to group identification or a ranking of
nodes on their importance to connectivity. They also give
the user a sense of the data complexity by the types and
numbers of trajectories.
Matrix factorizations are useful for community detection and selectively removing edges, which both facilitate visualizations of graphs. The use of temporal penalties allows the user to control how sensitive the recovered
structure is to short term fluctuations. Smoothing out
noise and highlighting the main temporal trends can
facilitate other matrix-based representations, such as [8],
for visualization of edge dynamics.
9.2 Weaknesses
The optimal choice of tuning parameters (λt , λs ) is dependent on perception and how the edge weights are
scaled. Thus, the user will likely need to experiment with
different parameters each time a sequence of networks
is encountered. This can limit the benefits of the proposed approach when given a large number of network
sequences.
Time plots and heatmaps to visualize each factor
yield limited information about the underlying global
topology. For example, one can see from Figs. 4 and 5
that there are dominant nodes, but in principle, there
could be many topologies that feature dominant nodes.
One cannot say for sure without additional analysis that
the networks have an emergent hub structure or follow a
particular model. Conveying global geometric structure
is traditionally an area where graph drawing excels.
Thus, combining the matrix factorization approach in
this article with existing visualization tools can provide
a more comprehensive view of the data.
9.3 Future Work
An important area of exploration would be to systematically compare penalized versions of NMF and SVD.
In this work we chose to focus on NMF, since we
find the corresponding displays preferable in terms of
interpretability. This is generally consistent with existing literature on matrix factorization. However, the SVD of
graph-related matrices has deep connections to classical
spectral layout and problems in community detection.
There may be classes of graph topologies and particular
visualization goals under which SVD is preferable.
There could also be other types and combinations of
penalties that are useful in visualization and detection of
graph structure. In this work, we are primarily interested
in visualizing the evolution of node connectivity and extracting stable structure when given dynamic networks
with short term fluctuations. If one is interested in evolution at the group or edge level, or in detecting particular
topologies, other penalties may be more beneficial.
REFERENCES
[1] B. Skyrms and R. Pemantle, “A dynamic model of social
network formation,” Proceedings of the National Academy of
Sciences, vol. 97, no. 16, pp. 9340–9346, 2000. [Online]. Available:
http://www.pnas.org/content/97/16/9340.abstract
[2] R. Opgen-Rhein and K. Strimmer, “Inferring gene dependency
networks from genomic longitudinal data: a functional data approach,” REVSTAT, vol. 4, no. 1, pp. 53–65, 2006.
[3] L. Adamic, C. Brunetti, J. H. Harris, and A. A. Kirilenko, “Trading
Networks,” SSRN eLibrary, 2010.
[4] D. Archambault, H. Purchase, and B. Pinaud, “Animation, small
multiples, and the effect of mental map preservation in dynamic
graphs,” Visualization and Computer Graphics, IEEE Transactions on,
vol. 17, no. 4, pp. 539 –552, april 2011.
[5] S. Ghani, N. Elmqvist, and J.-S. Yi, “Perception of animated nodelink diagrams for dynamic graphs,” Computer Graphics Forum
(Proc. EuroViz 2012), vol. 31, no. 3, pp. 1205–1214, 2012.
[6] T. von Landesberger, A. Kuijper, T. Schreck, J. Kohlhammer, J.-J.
van Wijk, and D.-W. Fellner, “Visual analysis of large graphs,”
EuroGraphics state of the art reports, 2010.
[7] T. von Landesberger, A. Kuijper, T. Schreck, J. Kohlhammer,
J. van Wijk, J.-D. Fekete, and D. Fellner, “Visual analysis of large
graphs: State-of-the-art and future research challenges,” Computer
Graphics Forum, vol. 30, no. 6, pp. 1719–1749, 2011. [Online].
Available: http://dx.doi.org/10.1111/j.1467-8659.2011.01898.x
[8] J. S. Yi, N. Elmqvist, and S. Lee, “Timematrix: Analyzing temporal
social networks using interactive matrix-based visualizations,”
International Journal of Human-Computer Interaction, vol. 26, no. 11-12, pp. 1031–1051, 2010.
[9] M. Newman, Networks: An Introduction. Oxford University Press,
2010.
[10] M. Newman, A. Barabási, and D. Watts, The Structure And Dynamics of Networks, ser. Princeton Studies in Complexity. Princeton
University Press, 2006.
[11] E. Bullmore and O. Sporns, “Complex brain networks: Graph
theoretic analysis of structural and functional systems,” Nature
Reviews, vol. 10, pp. 186–198, 2009.
Fig. 8. World maps over time, where countries are colored corresponding to their NMF factorization, with three
communities. The particular color mixture indicates how much each component contributes to the country. The first
row shows the most important reconstructed edges. Panels: 1980, 1990, 1997.
[12] M. Kaufmann and D. Wagner, Drawing Graphs: Methods and
Models, ser. Lecture Notes in Computer Science. Springer, 2001.
[13] G. Di Battista, Graph drawing: algorithms for the visualization of
graphs, ser. An Alan R. Apt book. Prentice Hall, 1999.
[14] D. Auber, “Tulip a huge graph visualization framework,”
in Graph Drawing Software, ser. Mathematics and Visualization,
M. Jünger and P. Mutzel, Eds. Springer Berlin Heidelberg, 2004,
pp. 105–126.
[15] M. Bastian, S. Heymann, and M. Jacomy, “Gephi: An open
source software for exploring and manipulating networks,”
2009. [Online]. Available: http://www.aaai.org/ocs/index.php/
ICWSM/09/paper/view/154
[16] D. Auber, D. Archambault, R. Bourqui, A. Lambert, M. Mathiaut,
P. Mary, M. Delest, J. Dubois, and G. Mélançon, “The Tulip
3 Framework: A Scalable Software Library for Information
Visualization Applications Based on Relational Data,” INRIA,
Research Report RR-7860, Jan. 2012. [Online]. Available: http:
//hal.inria.fr/hal-00659880
[17] W. De Nooy, A. Mrvar, and V. Batagelj, Exploratory Social Network
Analysis With Pajek, ser. Structural Analysis in the Social Sciences.
Cambridge University Press, 2011.
[18] U. Brandes and D. Wagner, “Visone – analysis and visualization of
social networks,” in GRAPH DRAWING SOFTWARE. SpringerVerlag, 2003, pp. 321–340.
[19] N. Elmqvist and J.-D. Fekete, “Hierarchical aggregation for information visualization: Overview, techniques, and design guidelines,” Visualization and Computer Graphics, IEEE Transactions on,
vol. 16, no. 3, pp. 439 –454, may-june 2010.
[20] Z. Shen, K.-L. Ma, and T. Eliassi-Rad, “Visual analysis of large
heterogeneous social networks by semantic and structural abstraction,” Visualization and Computer Graphics, IEEE Transactions
on, vol. 12, no. 6, pp. 1427 –1439, nov.-dec. 2006.
[21] C. Correa, Y.-H. Chan, and K.-L. Ma, “A framework for
uncertainty-aware visual analytics,” in Visual Analytics Science and
Technology, 2009. VAST 2009. IEEE Symposium on, oct. 2009, pp. 51
–58.
[22] C. Correa, T. Crnovrsanin, and K.-L. Ma, “Visual reasoning about
social networks using centrality sensitivity,” Visualization and Computer Graphics, IEEE Transactions on, vol. 18, no. 1, pp. 106 –120,
jan. 2012.
[23] I. Dhillon, Y. Guan, and B. Kulis, “Weighted graph cuts without
eigenvectors a multilevel approach,” Pattern Analysis and Machine
Intelligence, IEEE Transactions on, vol. 29, no. 11, pp. 1944 –1957,
nov. 2007.
[24] Y. Frishman and A. Tal, “Online dynamic graph drawing,” Visualization and Computer Graphics, IEEE Transactions on, vol. 14, no. 4,
pp. 727 –740, july-aug. 2008.
[25] H. Purchase and A. Samra, “Extremes are better: Investigating
mental map preservation in dynamic graphs,” in Diagrammatic
Representation and Inference, ser. Lecture Notes in Computer Science, G. Stapleton, J. Howse, and J. Lee, Eds. Springer Berlin /
Heidelberg, 2008, vol. 5223, pp. 60–73.
[26] P. Saffrey and H. Purchase, “The ”mental map” versus ”static
aesthetic” compromise in dynamic graphs: a user study,” in
Proceedings of the ninth conference on Australasian user interface Volume 76, ser. AUIC ’08.
Darlinghurst, Australia, Australia:
Australian Computer Society, Inc., 2008, pp. 85–93.
[27] E. Tufte, Envisioning information. Graphics Press, 1990.
[28] M. Farrugia and A. Quigley, “Effective temporal graph layout: A
comparative study of animation versus static display methods,”
Information Visualization, vol. 10, no. 1, pp. 47–64, 2011.
[29] F. R. K. Chung, Spectral Graph Theory. Amer. Math. Soc., 1997.
[30] Y. Koren, “Drawing graphs by eigenvectors: theory and practice,”
Computers & Mathematics with Applications, vol. 49, no. 1112, pp.
14
[31]
[32]
[33]
[34]
[35]
[36]
[37]
[38]
[39]
[40]
[41]
[42]
[43]
[44]
[45]
[46]
[47]
[48]
[49]
[50]
[51]
1867 – 1888, 2005. [Online]. Available: http://www.sciencedirect.
com/science/article/pii/S089812210500204X
T. Hastie, R. Tibshirani, and J. H. Friedman, The elements of
statistical learning: data mining, inference, and prediction: with 200
full-color illustrations. New York: Springer-Verlag, 2001.
U. Brandes, D. Fleischer, and T. Puppe, “Dynamic spectral layout
of small worlds,” pp. 25–36, 2006, 10.1007/11618058 3. [Online].
Available: http://gdea.informatik.uni-koeln.de/677/
D. D. Lee and H. S. Seung, “Learning the parts of objects by nonnegative matrix factorization,” Nature, vol. 401, pp. 788–791, 10
1999.
P. Paatero and U. Tapper, “Positive matrix factorization: A nonnegative factor model with optimal utilization of error estimates
of data values,” Environmetrics, vol. 5, no. 2, pp. 111–126, 1994.
[Online]. Available: http://dx.doi.org/10.1002/env.3170050203
K. Devarajan, “Nonnegative matrix factorization: An analytical
and interpretive tool in computational biology,” PLoS Comput
Biol, vol. 4, no. 7, p. e1000029, 07 2008. [Online]. Available:
http://dx.doi.org/10.1371%2Fjournal.pcbi.1000029
C. Ding, X. He, and H. D. Simon, “On the equivalence of
nonnegative matrix factorization and spectral clustering,” in Proc.
SIAM Data Mining Conf, 2005, pp. 606–610.
C. Ding, T. Li, and W. Peng, “On the equivalence between nonnegative matrix factorization and probabilistic latent semantic
indexing,” Comput. Stat. Data Anal., vol. 52, no. 8, pp. 3913–3927,
Apr. 2008. [Online]. Available: http://dx.doi.org/10.1016/j.csda.
2008.01.011
I. Psorakis, S. Roberts, M. Ebden, and B. Sheldon, “Overlapping
community detection using bayesian non-negative matrix
factorization,” Phys. Rev. E, vol. 83, p. 066114, Jun 2011. [Online].
Available: http://link.aps.org/doi/10.1103/PhysRevE.83.066114
F. Wang, T. Li, X. Wang, S. Zhu, and C. Ding, “Community
discovery using nonnegative matrix factorization,” Data Min.
Knowl. Discov., vol. 22, pp. 493–521, May 2011. [Online]. Available:
http://dx.doi.org/10.1007/s10618-010-0181-y
D. D. Lee and H. S. Seung, “Algorithms for non-negative matrix
factorization,” Advances in neural information processing systems, pp.
556–562, 2001.
H. Zou, T. Hastie, and R. Tibshirani, “Sparse principal component
analysis,” Journal of Computational and Graphical Statistics, vol. 15,
no. 2, pp. 265–286, 2006.
D. M. Witten, R. Tibshirani, and T. Hastie, “A penalized
matrix decomposition, with applications to sparse principal
components and canonical correlation analysis,” Biostatistics,
vol. 10, no. 3, pp. 515–534, 2009. [Online]. Available: http:
//biostatistics.oxfordjournals.org/content/10/3/515.abstract
J. Guo, G. James, E. Levina, G. Michailidis, and J. Zhu, “Principal
component analysis with sparse fused loadings,” Journal of
Computational and Graphical Statistics, vol. 19, no. 4, pp. 930–946,
2010. [Online]. Available: http://pubs.amstat.org/doi/abs/10.
1198/jcgs.2010.08127
M. W. Berry, M. Browne, A. N. Langville, V. P. Pauca, and
R. J. Plemmons, “Algorithms and applications for approximate
nonnegative matrix factorization,” in Computational Statistics and
Data Analysis, 2006, pp. 155–173.
Z. Chen and A. Cichocki, “Nonnegative matrix factorization with
temporal smoothness and/or spatial decorrelation constraints,” in
Laboratory for Advanced Brain Signal Processing, RIKEN, Tech. Rep,
2005.
P. O. Hoyer, “Non-negative sparse coding,” in In Neural Networks
for Signal Processing XII (Proc. IEEE Workshop on Neural Networks
for Signal Processing), 2002, pp. 557–565.
——, “Non-negative matrix factorization with sparseness
constraints,” J. Mach. Learn. Res., vol. 5, pp. 1457–1469, December
2004. [Online]. Available: http://portal.acm.org/citation.cfm?id=
1005332.1044709
D. Cai, X. He, J. Han, and T. Huang, “Graph regularized nonnegative matrix factorization for data representation,” Pattern Analysis
and Machine Intelligence, IEEE Transactions on, vol. 33, no. 8, pp.
1548–1560, Aug. 2011.
M. Chu, F. Diele, R. Plemmons, and S. Ragni, “Optimality, computation, and interpretation of nonnegative matrix factorizations,”
SIAM Journal on Matrix Analysis, pp. 4–8030, 2004.
G. Bounova, “Matlab tools for network analysis,” Dec. 2011.
[Online]. Available: http://strategic.mit.edu/downloads.php
A.-L. Barabási and R. Albert, “Emergence of scaling in random
networks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.
[Online]. Available: http://www.sciencemag.org/content/286/5439/509.abstract
A. Clauset, C. Rohilla Shalizi, and M. E. J. Newman, “Power-law
distributions in empirical data,” ArXiv e-prints, Jun. 2007.
J. Gehrke, P. Ginsparg, and J. M. Kleinberg, “Overview of the
2003 KDD Cup,” in SIGKDD Explorations, vol. 5, no. 2, 2003, pp.
149–151.
J. Leskovec, J. Kleinberg, and C. Faloutsos, “Graphs over
time: densification laws, shrinking diameters and possible
explanations,” in Proceedings of the eleventh ACM SIGKDD
international conference on Knowledge discovery in data mining,
ser. KDD ’05. New York, NY, USA: ACM, 2005, pp. 177–187.
[Online]. Available: http://doi.acm.org/10.1145/1081870.1081893
Proceedings of the IEEE Symposium on Visual Analytics Science and
Technology, IEEE VAST 2008, Columbus, Ohio, USA, 19-24 October
2008. IEEE, 2008.
A. A. Shaverdian, H. Zhou, G. Michailidis, and H. V. Jagadish,
“Algebraic visual analysis: the Catalano phone call data set
case study,” in Proceedings of the ACM SIGKDD Workshop on
Visual Analytics and Knowledge Discovery: Integrating Automated
Analysis with Interactive Exploration, ser. VAKD ’09. New
York, NY, USA: ACM, 2009, pp. 74–82. [Online]. Available:
http://doi.acm.org/10.1145/1562849.1562858
Z. Shen and K.-L. Ma, “Mobivis: A visualization system for exploring mobile data,” in Visualization Symposium, 2008. PacificVIS
’08. IEEE Pacific, Mar. 2008, pp. 175–182.
Q. Ye, B. Wu, D. Hu, and B. Wang, “Exploring temporal egocentric
networks in mobile call graphs,” in Fuzzy Systems and Knowledge
Discovery, 2009. FSKD ’09. Sixth International Conference on, vol. 2,
Aug. 2009, pp. 413–417.
R. C. Feenstra, R. E. Lipsey, H. Deng, A. C. Ma, and H. Mo, “World
trade flows: 1962–2000,” NBER Working Paper no. 11040, 2004.
L. De Benedictis and L. Tajoli, “The world trade network,” The
World Economy, vol. 34, no. 8, pp. 1417–1454, 2011. [Online].
Available: http://dx.doi.org/10.1111/j.1467-9701.2011.01360.x
J. E. Stiglitz, “Some lessons from the East Asian miracle,” The
World Bank Research Observer, vol. 11, no. 2, pp. 151–177, 1996.
[Online]. Available: http://wbro.oxfordjournals.org/content/11/2/151.abstract
R. R. Nelson and H. Pack, “The Asian miracle and modern
growth theory,” The World Bank, Policy Research Working Paper
Series 1881, Feb. 1998. [Online]. Available:
http://ideas.repec.org/p/wbk/wbrwps/1881.html
Shawn Mankad received a B.S. in Mathematics
and Statistics from Carnegie Mellon University in
2008, and an M.A. in Statistics from the University of Michigan in 2012. He is working toward
a Statistics PhD at the University of Michigan
under the supervision of George Michailidis. His
research interests include space-time models for
information extraction and visualization, network
estimation, and analytical techniques with applications to economics, finance, and complex systems, among others.
George Michailidis received his Ph.D. in Mathematics from UCLA in 1996. He was a postdoctoral fellow in the Department of Operations
Research at Stanford University from 1996 to
1998. He joined The University of Michigan in
1998, where he is currently a Professor of Statistics, Electrical Engineering & Computer Science.
He is a Fellow of the Institute of Mathematical
Statistics and of the American Statistical Association and an Elected Member of the International
Statistical Institute. He has served as Associate
Editor of many statistics journals, including the Journal of the American
Statistical Association, Technometrics, and the Journal of Computational and
Graphical Statistics. His research interests are in the areas of analysis
and inference of high-dimensional data, networks, and visualization.