Constrained Graph Construction Problems
in Network Modeling
Zoltán Toroczkai
Department of Physics, University of
Notre Dame
Collaborators:
Sponsors:
Szabolcs Horvát (U. Notre Dame)
István Miklós (Rényi Inst. Math.)
Peter L. Erdős (Rényi Inst. Math.)
Kevin E. Bassler (U. Houston)
Charo I. Del Genio (U. Warwick)
Hyunju Kim (Arizona State)
László Székely (U. South Carolina)
Éva Czabarka (U. South Carolina)
Chess Puzzle: Swap the positions of white knights with those of the black knights
[Figure: chessboard with files a-d and ranks 1-4]
I. Network representation
II. Redirected the process of thought to the “where” pathway.
[Figure: the knight-move network on the squares b1, c3, a2, d1, c4, b2, c1, d3, b3, a1, c2, drawn in two layouts]
Not optimal: it only respects
the relationships.
Optimized layout: minimal edge crossings,
minimal wire length.
This representation allows us to infer and
exploit GLOBAL Information quickly.
Global information is necessary for finding solutions fast (esp. NP-hard problems).
“Dumb” algorithms: representation independent → very inefficient.
“Smart” algorithms: exploit the structure of the data / global information.
How do we know that there is global information in a dataset?
How do we extract it?
OPEN!
This is very typical, e.g. :
Interareal network in the macaque cortex
N.T. Markov, et al. Science 342(6158), 1238406 (2013)
I understand a network if I can generate it (or similar versions of it).
We are looking for the essential factors that generate the global information within
the structure.
Essential factors may appear through constraints.
Indeed, for the cortical network:
Wiring costs and cortical geometry
+
M. Ercsey-Ravasz et al., Neuron 80, 184-197 (2013).
N.T. Markov, et al. Science 342(6158), 1238406 (2013)
- Many features and network measures captured
- What is not captured: noise, or structures that need new constraints/info
 Data Driven Network Modeling
[Diagram: constraints → ensemble]
Typical scenarios (Data → Want):
o Partial info → Good guesses about the rest
o Complete → Plausible constraints capturing the data
Constraints can be imposed:
• precisely/verbatim – Sharp constraints
• “softly”, via ensemble averages – Average constraints
This setting defines a set of fundamental problems related to ensemble-based
modeling of complex networks.
 Sharp Constraints
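Consider:
- the set of all simple graphs on N nodes: $\mathcal{G}_N$ ($|\mathcal{G}_N| = 2^{N(N-1)/2}$)
- a set of graph measures, or observables (the constraints): $\mathbf{m} = (m_1, \dots, m_K)$, with prescribed values $\mathbf{m}_0$
Def. 1 : Sharply constrained ensemble: $\mathcal{G}(\mathbf{m}_0) = \{ G \in \mathcal{G}_N : \mathbf{m}(G) = \mathbf{m}_0 \}$,
i.e., all members of the ensemble have precisely the values prescribed by the constraints for the corresponding graph measures.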
There are 4 main problem classes related to network modeling with sharp constraints:
• Existence: Under what conditions on the constraints is the ensemble non-empty?
• Construction: How to build any (or all) members of the ensemble?
• Sampling: How to sample members of the ensemble by some distribution (e.g., uniformly)?
• Counting: How to compute or estimate the size of the ensemble?
Typically studied problems:
 Degree Sequence
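Specifies the number of neighbors for all nodes:
- a sequence $\mathbf{d} = (d_1, \dots, d_N)$ for undirected graphs
- a sequence of (in-degree, out-degree) pairs for directed graphs
- a pair of sequences, one per node class, for bipartite graphs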
 Joint Degree Matrix (JDM)
A JDM specifies the number of edges between nodes of given degrees, for all degree pairs.
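Partition the nodes into groups of given degrees (classes): $V_i = \{ v : d(v) = i \}$, $i = 1, \dots, \Delta$, where $\Delta$ is the maximum degree.
Then: $J_{ij}$ is the number of edges between $V_i$ and $V_j$ (for $i = j$, the number of edges within $V_i$).
A JDM is a stronger constraint than the degree sequence, which it also determines uniquely: $|V_i| = \frac{1}{i}\big(J_{ii} + \sum_{j=1}^{\Delta} J_{ij}\big)$
One can think of the JDM as also specifying “two-point correlations” between nodal degrees.
Applications include, e.g., social networks, which are distinguished by positive degree correlations (assortative networks).
A.N. Patrinos & S.L. Hakimi. Discr. Math. 15, 347 (1976).
I. Stanton & A. Pinar. ACM J. Exp. Alg. 17(3), 3.5 (2012).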
 Existence
Def. 2 : If the ensemble is non-empty, we say that the constraint is graphical. Any graph in the ensemble is said to realize the constraint; in this case it is called a graphical realization of the constraint.
 Degree Sequence
Well known, characterized.
Erdős-Gallai (EG)/Fulkerson-Ryser type theorems
E.g., for $d_1 \ge d_2 \ge \dots \ge d_N$: 1) $\sum_{i=1}^{N} d_i$ must be even and 2) $\sum_{i=1}^{k} d_i \le k(k-1) + \sum_{i=k+1}^{N} \min(k, d_i)$ must hold for all $1 \le k \le N$.
Havel-Hakimi algorithm: Given a graphical sequence, choose a node, and connect all its stubs to other nodes with the largest residual degrees. Repeat until all stubs are connected into edges.
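The two ingredients above (the Erdős-Gallai test and the Havel-Hakimi construction) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the sketch always picks the node with the largest residual degree, although the algorithm allows any node.

```python
def is_graphical(degrees):
    """Erdős-Gallai test for a degree sequence of a simple undirected graph."""
    d = sorted(degrees, reverse=True)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    n = len(d)
    for k in range(1, n + 1):
        lhs = sum(d[:k])
        rhs = k * (k - 1) + sum(min(k, x) for x in d[k:])
        if lhs > rhs:
            return False
    return True


def havel_hakimi(degrees):
    """Return an edge list realizing `degrees`, or None if it is not graphical."""
    if not is_graphical(degrees):
        return None
    residual = list(degrees)
    edges = []
    while True:
        # Assumption of this sketch: pick the node with the largest residual degree.
        v = max(range(len(residual)), key=lambda i: residual[i])
        k = residual[v]
        if k == 0:
            return edges
        # Connect all stubs of v to the other nodes with the largest residual degrees.
        others = sorted((i for i in range(len(residual)) if i != v),
                        key=lambda i: residual[i], reverse=True)
        targets = others[:k]
        if len(targets) < k or residual[targets[-1]] == 0:
            return None
        for u in targets:
            residual[u] -= 1
            edges.append((min(u, v), max(u, v)))
        residual[v] = 0
```

Usage: `havel_hakimi([3, 3, 2, 2, 2])` returns six edges forming a simple graph with exactly these degrees, while `havel_hakimi([3, 3, 1, 1])` returns `None` because the sequence fails the EG inequality at k = 2.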
 Joint Degree Matrix (JDM)
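Theorem. A $\Delta \times \Delta$ symmetric matrix $J$ with non-negative integer entries is a graphical JDM iff:
1) $n_i = \frac{1}{i}\big(J_{ii} + \sum_{j=1}^{\Delta} J_{ij}\big)$ (the number of nodes of degree $i$) is an integer for all $i$
2) $J_{ii} \le \binom{n_i}{2}$ for all $i$
3) $J_{ij} \le n_i n_j$ for all $i \ne j$
É. Czabarka, A. Dutle, P.L. Erdős, I. Miklós. Disc. Appl. Math. 181, 283 (2015). → a clean and short proof of this EG-type theorem.
Others have also provided similar characterizations (Stanton-Pinar, Amanatidis-Green-Mihail)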
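A graphicality check for a JDM can be sketched as follows. The sketch assumes the usual form of the conditions (integer class sizes, and no more edges than available node pairs within or between classes), and it assumes the JDM is given as a dictionary `{(i, j): edge count}` with `1 <= i <= j`; both the representation and the function names are illustrative choices.

```python
from math import comb


def class_sizes(jdm):
    """Class sizes n_i implied by a JDM, or None if some n_i is not an integer."""
    stub_count = {}
    for (i, j), e in jdm.items():
        # Each edge contributes one stub to class i and one to class j;
        # an edge inside a class (i == j) therefore contributes two stubs to it.
        stub_count[i] = stub_count.get(i, 0) + e
        stub_count[j] = stub_count.get(j, 0) + e
    sizes = {}
    for i, stubs in stub_count.items():
        if stubs % i != 0:
            return None
        sizes[i] = stubs // i
    return sizes


def is_graphical_jdm(jdm):
    """Check the three conditions: integer n_i, J_ii <= C(n_i, 2), J_ij <= n_i * n_j."""
    if any(e < 0 for e in jdm.values()):
        return False
    sizes = class_sizes(jdm)
    if sizes is None:
        return False
    for (i, j), e in jdm.items():
        limit = comb(sizes[i], 2) if i == j else sizes[i] * sizes[j]
        if e > limit:
            return False
    return True
```

For example, the path on three nodes has JDM `{(1, 2): 2}` and the triangle has `{(2, 2): 3}`; both pass, while `{(2, 2): 2}` fails because two nodes of degree 2 cannot carry two edges between them.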
 Construction
o Direct construction: sequentially connect stubs (half-edges).
- How do we build any graph from the ensemble? Efficiently?
o Switches/Swaps: start from a realization in the ensemble, then move edges around via some operations (e.g., edge swaps/switches) to arrive at another member of the ensemble.
- What operations guarantee that all members of the ensemble can be reached this way?
 Degree Sequences
o Direct construction
Theorem (KTEMS): Provides necessary and sufficient conditions for graphicality of degree sequences that
are restricted with forbidden edges forming a k-star on an arbitrary node i :
- non-edges (forbidden links)
Undirected graphs:
H. Kim, Z. Toroczkai, P.L. Erdős, I. Miklós and L.A. Székely. J. Phys. A: Math. Theor. 42, 392001 (2009).
J. Blitzstein, P. Diaconis. Internet Mathematics 6(4), 487-520 (2010).
Directed graphs:
P.L. Erdős, I. Miklós and Z. Toroczkai. Elec. J. Comb. 17(1), R66 (2010).
o Switches/Swaps
[Figure: a 2-swap on nodes 1, 2, 3, 4]
Swap the ends of two independent edges (2-swap):
- This preserves the degree sequence and connects the space of all realizations (Ryser).
- Start from a graphical realization (e.g., one built by Havel-Hakimi), then do 2-swaps.
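A 2-swap is easy to state in code. The sketch below stores a graph as a set of `frozenset` edges (an assumption of this example) and rejects swaps that would create a self-loop or a multi-edge. The helper `random_swaps` only illustrates repeated swapping; it makes no claim about uniformity or mixing.

```python
import random


def try_two_swap(edges, a, b, c, d):
    """Replace edges a-b and c-d by a-d and c-b if the result is a simple graph."""
    old1, old2 = frozenset((a, b)), frozenset((c, d))
    new1, new2 = frozenset((a, d)), frozenset((c, b))
    if old1 not in edges or old2 not in edges:
        return False
    if len({a, b, c, d}) < 4:           # the two edges must be independent
        return False
    if new1 in edges or new2 in edges:  # would create a multi-edge
        return False
    edges -= {old1, old2}
    edges |= {new1, new2}
    return True


def random_swaps(edges, steps, rng):
    """Apply `steps` randomly proposed 2-swaps to a copy of `edges`."""
    edges = set(edges)
    for _ in range(steps):
        e1, e2 = rng.sample(sorted(edges, key=sorted), 2)
        a, b = sorted(e1)
        c, d = sorted(e2)
        if rng.random() < 0.5:          # choose one of the two possible rewirings
            c, d = d, c
        try_two_swap(edges, a, b, c, d)
    return edges


def degree_sequence(edges, n):
    """Degrees of nodes 0..n-1."""
    deg = [0] * n
    for e in edges:
        for v in e:
            deg[v] += 1
    return deg
```

Usage: starting from the 6-cycle, `random_swaps(cycle, 100, random.Random(0))` returns another graph with six edges in which every node still has degree 2.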
 Joint Degree Matrix (JDM)
o Direct construction
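Def. : Let $d_j(v)$ be the degree of node $v$ towards degree class $V_j$ (the set of nodes of degree $j$), i.e., the number of its neighbors in $V_j$. The degree “spectrum” of node $v$ is the vector $(d_1(v), \dots, d_\Delta(v))$, where $\Delta$ is the maximum degree.
- Generate a degree spectrum (any), then build all bipartite graphs between the degree classes, then create all simple graphs within every degree class.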
P.L. Erdős, I. Miklós, C. I. Del Genio, K.E. Bassler & Z. Toroczkai. New. J. Phys. 17, 083052 (2015).
o Switches/Swaps
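[Figure: a restricted swap on nodes 1, 2, 3, 4, with two of the nodes in the same degree class]
- Restricted Swap Operation (RSO): a 2-swap in which the two nodes exchanged between the edges belong to the same degree class.
- The RSO preserves the JDM and connects the space of all realizations of the JDM.
É. Czabarka, A. Dutle, P.L. Erdős, I. Miklós. Disc. Appl. Math. 181, 283 (2015).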
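To see why the restriction matters, the sketch below computes the JDM of a graph and implements a restricted swap that exchanges the end nodes `b` and `d` only when they have the same degree. The edge representation (a set of `frozenset` pairs) and the function names are assumptions of this example.

```python
from collections import Counter


def degrees(edges):
    """Degree of every node that appears in an edge."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def joint_degree_matrix(edges):
    """JDM as a Counter {(i, j): number of edges between degrees i <= j}."""
    deg = degrees(edges)
    jdm = Counter()
    for u, v in edges:
        i, j = sorted((deg[u], deg[v]))
        jdm[(i, j)] += 1
    return jdm


def try_restricted_swap(edges, a, b, c, d):
    """Swap a-b, c-d -> a-d, c-b, allowed only if b and d have the same degree."""
    deg = degrees(edges)
    if deg[b] != deg[d]:
        return False
    old = {frozenset((a, b)), frozenset((c, d))}
    new = {frozenset((a, d)), frozenset((c, b))}
    if len({a, b, c, d}) < 4 or not old <= edges or new & edges:
        return False
    edges -= old
    edges |= new
    return True
```

On the graph made of the path 0-1-2-3 and the separate edge 4-5, swapping with `(a, b, c, d) = (0, 1, 3, 2)` is allowed (nodes 1 and 2 both have degree 2) and leaves the JDM unchanged, whereas `(1, 2, 4, 5)` is rejected: an unrestricted 2-swap there would destroy the single edge between two degree-2 nodes.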
 Sampling
o Direct construction based importance sampling
o Markov Chain Monte Carlo (MCMC) based on switches
Requirements:
- Sample a realization from the ensemble in a number of steps polynomial in N (“in poly-time”).
- Obtain pseudo-random realizations via MCMC switching in polynomial mixing time.
 Degree Sequence
o Direct construction based
- undirected: C.I. Del Genio, H. Kim, Z. Toroczkai and K.E. Bassler. PLoS ONE 5(4), e10012 (2010).
- directed: H. Kim, C.I. Del Genio, K.E. Bassler and Z. Toroczkai. New J. Phys. 14, 023012 (2012).
o MCMC based on edge swaps. This is the most studied, in particular the Mixing Time Problem.
Definitions:
- $\mathbb{G}(\mathbf{d})$: “supergraph” whose nodes are all the graphical realizations of the degree sequence $\mathbf{d}$. A “super-edge” $(G, G')$ means that a 2-swap in the graph $G$ takes it to graph $G'$.
- The MCMC is a random walk on $\mathbb{G}(\mathbf{d})$: a Markov chain with probability transition matrix $P$.
- Let $1 = \lambda_1 > \lambda_2 \ge \dots \ge \lambda_M \ge -1$ be the eigenvalues of $P$ and $\lambda^* = \max(\lambda_2, |\lambda_M|)$.
Thus to show fast mixing one needs to find a polynomial upper bound (in the size of the graphs, N, the nr of nodes) on the mixing time, or on the relaxation time: $\tau_{\mathrm{rel}} = \frac{1}{1 - \lambda^*}$
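For very small degree sequences the supergraph can be built explicitly by brute force, which makes the state space of the swap chain concrete. The sketch below enumerates all realizations, generates the 2-swap neighbors of each, and checks that the supergraph is connected. It is exponential in N and meant only as an illustration; the function names are hypothetical.

```python
from itertools import combinations


def realizations(degrees):
    """All simple graphs (frozensets of edges (u, v), u < v) with the given degrees."""
    n = len(degrees)
    pairs = list(combinations(range(n), 2))
    m = sum(degrees) // 2
    found = []
    for edge_set in combinations(pairs, m):
        deg = [0] * n
        for u, v in edge_set:
            deg[u] += 1
            deg[v] += 1
        if deg == list(degrees):
            found.append(frozenset(edge_set))
    return found


def swap_neighbors(g):
    """All graphs reachable from g by a single 2-swap (the super-edges at g)."""
    out = set()
    for (a, b), (c, d) in combinations(sorted(g), 2):
        if len({a, b, c, d}) < 4:
            continue
        for x, y in (((a, c), (b, d)), ((a, d), (b, c))):
            x, y = tuple(sorted(x)), tuple(sorted(y))
            if x not in g and y not in g:
                out.add((g - {(a, b), (c, d)}) | {x, y})
    return out


def supergraph_is_connected(degrees):
    """Depth-first search over the supergraph of all realizations."""
    nodes = realizations(degrees)
    if not nodes:
        return False
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        g = stack.pop()
        for h in swap_neighbors(g):
            if h not in seen:
                seen.add(h)
                stack.append(h)
    return len(seen) == len(nodes)
```

For the 2-regular sequence on six nodes, `realizations([2] * 6)` finds 70 graphs (60 six-cycles and 10 pairs of triangles), and `supergraph_is_connected([2] * 6)` confirms that 2-swaps connect all of them.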
Conjecture (Kannan, Tetali, Vempala, 1999): The switch MCMC based on 2-swaps mixes rapidly over the set of all realizations of any graphical degree sequence.
This is still open!
- They have shown it only for regular bipartite graphs (same degrees everywhere).
R. Kannan, P. Tetali and S. Vempala. Rand. Struct. Alg. 14 (4), 293-308 (1999)
- Cooper, Dyer and Greenhill have shown it for arbitrary regular undirected graphs.
C. Cooper, M. Dyer and C. Greenhill. Comb. Prob. Comp. 16 (4), 557-593 (2007)
- Greenhill proved it for regular directed graphs.
C. Greenhill. Electronic J. Comb. 16 (4), 557-593 (2011)
- C. Greenhill proved it for general bounded maximum degree undirected graphs.
Proc. 26th ACM-SIAM Symp. Discr. Alg., New York-Philadelphia, pp. 1564-1572 (2015). http://arxiv.org/abs/1412.5249
Additionally:
1) Miklós, Erdős and Soukup have proved it for half-regular bipartite graphs.
I. Miklós, P.L. Erdős & L. Soukup. Electronic J. of Comb. 20 (1), #P16, 1-51 (2013).
(A very technical proof of over 50 pages.)
Can we generate graphs uniformly at random that realize a given graphical degree sequence such that all realizations avoid creating edges from a forbidden subgraph?
Erdős, Kiss, Miklós and Soukup answered this question affirmatively for the following constraints: a half-regular bi-degree sequence (the degrees in one node class are all equal, those in the other class are arbitrary), with forbidden edges forming the union of a k-star centered on a node and a 1-factor (a perfect matching) between the two node classes.
Theorem: There is a switch MCMC that mixes fast (in poly-time) in the state space of all such realizations.
P.L. Erdős, S.Z. Kiss, I. Miklós and L. Soukup. PLOS ONE, #e0131300 (2015). http://arxiv.org/abs/1301.7523v2.
 Joint Degree Matrix (JDM)
Theorem: The space of all realizations of any given JDM is connected via RSOs.
→ The RSO-based MCMC is irreducible.
É. Czabarka, A. Dutle, P.L. Erdős, I. Miklós. Disc. Appl. Math. 181, 283 (2015).
Question: does the RSO-based MCMC mix rapidly (poly-time in N) on the set of all realizations of a JDM?
Theorem: The restricted swap operation Markov chain mixes rapidly over the balanced realizations of any JDM, i.e., its mixing time is bounded by a polynomial in N, where N is the number of nodes.
Def. : A realization of a JDM is balanced iff, for all pairs of degree classes, the degrees of the nodes within one degree class towards the other degree class are as uniformly distributed as possible.
All graphical JDMs admit balanced realizations.
P.L. Erdős, I. Miklós & Z. Toroczkai. SIAM J. Discr. Math. 29, 481 (2015). http://arxiv.org/abs/1307.5295
 Counting
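Compute or estimate the size of the ensemble.
- How constraining (or “non-random”) is the constraint?
- Computational hardness:
[Diagram: the problems Counting, U. Sampling, Construction and Existence, ordered by the relation “A is harder than B”; one relation is marked with “?”]
Def. 3:
Fully Polynomial Almost Uniform Sampler (FPAUS) (sampling):
- An MCMC algorithm that generates graph samples almost uniformly, in poly-time.
Fully Polynomial Randomized Approximation Scheme (FPRAS) (counting):
- An algorithm that estimates the size of the ensemble, to within any prescribed relative error and with high probability, in poly-time.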
• M.R. Jerrum, L.G. Valiant and V.V. Vazirani. Theor. Comput. Sci. 43, 169 (1986).
• V.V. Vazirani. Approximation Algorithms. Springer (2003).
• http://www.cc.gatech.edu/~vigoda/MCMC_Course/Sampling-Counting.pdf
[Diagram labels: FPRAS, FP Exact U Sampler]
Def. 4 : A problem is self-reducible if the solutions to any of its instances can be recursively generated
from solutions to smaller instances of the same problem, s.t. the number of branches at each recursion
step is polynomially bounded by the size of the problem instance.
Implications (for self-reducible problems):
Exact Counter → Exact U Sampler
FPRAS ↔ FPAUS
M.R. Jerrum, L.G. Valiant and V.V. Vazirani. Theor. Comput. Sci. 43, 169 (1986)
Thus, if we have an FPAUS we can estimate the size of the ensemble efficiently.
The classical degree-based graph construction problem is not self-reducible.
Theorem: The degree sequence problem constrained by a 1-factor and a k-star is self-reducible.
P.L. Erdős, S.Z. Kiss, I. Miklós and L. Soukup. PLOS ONE, #e0131300 (2015). http://arxiv.org/abs/1301.7523v2.
This implies that an FPRAS can be constructed to estimate the number of realizations.
 Soft Constraints: Maximum Entropy Ensembles
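Soft constraints:
Find a distribution $P(G)$ over the set of all graphs such that the ensemble average obeys: $\langle \mathbf{m} \rangle = \sum_G P(G)\, \mathbf{m}(G) = \mathbf{m}_0$
Graph measures: $\mathbf{m}(G) = (m_1(G), \dots, m_K(G))$. E.g.: # of edges $m_|$, # of two-stars $m_\vee$, # of triangles $m_\triangle$, ...
$\mathbf{m}_0$ - are the constraints, e.g. given by data.
There are many ways to choose probabilities $P(G)$ that satisfy these! How do we choose the $P(G)$?
The Maximum Entropy Principle: E. T. Jaynes, Physical Review 106, 620 (1957).
Choose the distribution that maximizes the information entropy $S[P] = -\sum_G P(G) \ln P(G)$ subject to the constraints $\sum_G P(G) = 1$ and $\sum_G P(G)\, \mathbf{m}(G) = \mathbf{m}_0$. The solution is
$P(G) = \frac{1}{Z(\boldsymbol{\beta})} e^{\boldsymbol{\beta} \cdot \mathbf{m}(G)}$, where $Z(\boldsymbol{\beta}) = \sum_G e^{\boldsymbol{\beta} \cdot \mathbf{m}(G)}$.
The parameters $\boldsymbol{\beta}$ control $\langle \mathbf{m} \rangle$. In practice, $\boldsymbol{\beta}$ is typically found numerically for a given $\mathbf{m}_0$.
Equivalent treatment: use distributions over measures instead of over graphs:
$p(\mathbf{m}) = \frac{1}{Z(\boldsymbol{\beta})} \mathcal{N}(\mathbf{m})\, e^{\boldsymbol{\beta} \cdot \mathbf{m}}$, where $\mathcal{N}(\mathbf{m})$ is the nr of graphs in the set of all graphs with property $\mathbf{m}$ (the density of states).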
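As a minimal numerical illustration of fitting the parameter, consider the simplest MaxEnt model, constrained only by the average number of edges. With M = N(N-1)/2 possible edges, the number of graphs with m edges is the binomial coefficient C(M, m), so the distribution over m is proportional to C(M, m) exp(beta m). The sketch below assumes this sign convention for beta and finds it by bisection; the function names are hypothetical, and the direct sum is only practical for small M.

```python
from math import comb, exp, log


def mean_edges(beta, n_pairs):
    """<m> under p(m) proportional to C(n_pairs, m) * exp(beta * m)."""
    weights = [comb(n_pairs, m) * exp(beta * m) for m in range(n_pairs + 1)]
    z = sum(weights)  # partition function Z(beta)
    return sum(m * w for m, w in enumerate(weights)) / z


def fit_beta(target, n_pairs, lo=-20.0, hi=20.0, tol=1e-10):
    """Find beta with <m> = target by bisection (<m> is increasing in beta)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_edges(mid, n_pairs) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For N = 5 nodes (M = 10 pairs) and a target of 3 edges on average, `fit_beta(3, 10)` agrees with the closed form `log(3 / 7)`, since this model is the Erdős-Rényi ensemble with edge probability 3/10. For models with several measures no closed form is available in general, and the same fit has to be done numerically in several dimensions.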
[Figure: distribution over the # of edges, with peaks at sparse and dense graphs]
The Degeneracy problem (Strauss 1986):
The sampled graphs may not be representative of the averages.
This happens when the distribution $p(\mathbf{m})$ over the measures is not unimodal.
o How does $p(\mathbf{m})$ become bimodal/multimodal?
o What can we do to eliminate/minimize this issue?
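Using the MaxEnt:
Example: 9 terrorist cells; 17 pairs interacted, 19 triples collaborated.
- What is the probability that the 9 cells form a connected network?
- What is the most likely network?
Exact enumeration: none of the graphs with 17 edges and 19 triangles is connected!
Yet MaxEnt says that the network is connected with probability 0.6 (but none of these connected graphs has 17 edges and 19 triangles)!
[Figure: disconnected (Disctd.) vs. connected (Conctd.) networks]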
MaxEnt has been used extensively:
It is applicable to systems of any size → tool to study mesoscale systems!
E. T. Jaynes, Physical Review 106, 620 (1957); ibid. 108, 171 (1957)
Nanothermodynamics:
R.V. Chamberlin. Phys. Rev. Lett. 82, 2520 (1999);
R.V. Chamberlin. Nature 408, 337 (2000);
R.V. Chamberlin. Science 298, 1172 (2002);
R. Balian. From Microphysics to Macrophysics: Methods and Applications of Statistical Physics (Springer) 2007.
Many applications:
- Image reconstruction: S.F. Gull, G.J. Daniell. Nature 272, 686 (1978) [real-space images from x-ray scattering data]
- Fluorescence of L-tryptophan: A.K. Livesey, J.C. Brochon. Biophys. J. 52, 693 (1987)
- Conformational states of poly-(L-proline) from single-molecule Förster resonance energy transfer data: L.P. Watkins, H. Chang, H. Yang. J. Phys. Chem. A 110, 5191 (2006).
- Folding kinetics of dihydrofolate reductase: P.J. Steinbach, R. Jonescu, C.R. Matthews. Biophys. J. 82, 2244 (2002).
- CO ligand rebinding to a heme protein: P.J. Steinbach, K. Chu, H. Frauenfelder, et al. Biophys. J. 61, 235 (1991).
- Gene regulatory networks: A.M. Walczak, G. Tkacik, W. Bialek. Phys. Rev. E. 81, 041905 (2010).
- Infotaxis of moths: M. Vergassola, E. Villermaux, B.I. Shraiman. Nature 446, 406 (2007).
- And many, many others...
THEOREM: The MaxEnt model is non-degenerate if and only if the density of states function $\mathcal{N}(\mathbf{m})$ is log-concave.
Sz. Horvát, É. Czabarka & Z. Toroczkai. Phys. Rev. Lett. 114, 158701 (2015).
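For a single integer-valued measure, log-concavity of the density of states can be checked directly: the support must have no gaps, and N(m)^2 >= N(m-1) N(m+1) must hold for all interior m. The sketch below covers only this one-dimensional case (the theorem concerns density of states functions in general), and the function name is hypothetical.

```python
from math import comb


def is_log_concave(counts):
    """counts[m] = N(m) >= 0. True iff the sequence is log-concave."""
    support = [m for m, c in enumerate(counts) if c > 0]
    # A log-concave sequence cannot have zeros strictly inside its support.
    if support and any(counts[m] == 0 for m in range(support[0], support[-1] + 1)):
        return False
    return all(counts[m] ** 2 >= counts[m - 1] * counts[m + 1]
               for m in range(1, len(counts) - 1))
```

Usage: the edge-count density of states on 5 nodes, `[comb(10, m) for m in range(11)]`, is log-concave, while a two-peaked sequence such as `[1, 5, 2, 5, 1]` is not.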
[Figures: the terrorist network and another example; labels: # of edges, # of two-stars]
A solution:
We still use the same data as in the degenerate model, however, we consider a one-to-one transformation of the measures such that the corresponding density of states function is log-concave and thus the corresponding model is non-degenerate.
Can still work in the same coordinate system, but the states are sampled by the non-degenerate model with constraints on the transformed measures.
How to choose the transformation?
The typical reason why the density of states is not log-concave is that its domain is not convex.
Any transformation that convexifies the domain is good!
[Figure panels: $(\langle m_| \rangle, \langle m_\vee \rangle)$ model; $(\langle m_|^2 \rangle, \langle m_\vee \rangle)$ model]
A data network example: Zachary’s Karate Club (ZKC)
Consider the model constrained by the number of edges and the number of two-stars, fitted to the ZKC data: the fitted model is degenerate!
After linearization we obtain a convex domain.
Let us try to predict the number of triangles.
- The distribution of triangles by the same (degenerate) model is also bimodal. Both of the observed values appear with very low probability in this model.
- The linearized (or convexified) model produces a unimodal distribution. Both of the observed values appear with high probability in this model.