Eigenvector Localization on Data-Dependent Graphs

Alexander Cloninger and Wojciech Czaja
Norbert Wiener Center
Department of Mathematics
University of Maryland, College Park
Email: {alex,wojtek}@math.umd.edu
Abstract—We aim to understand and characterize embeddings
of datasets with small anomalous clusters using the Laplacian
Eigenmaps algorithm. To do this, we characterize the order in
which eigenvectors of a disjoint graph Laplacian emerge and
the support of those eigenvectors. We then extend this characterization to weakly connected graphs with clusters of differing
sizes, utilizing the theory of invariant subspace perturbations
and proving some novel results. Finally, we propose a simple
segmentation algorithm for anomalous clusters based on our theory.
I. INTRODUCTION TO GRAPH THEORY IN DIMENSION REDUCTION
Many nonlinear dimensionality reduction techniques, such
as Laplacian Eigenmaps [1], Diffusion Maps [5], and Locally Linear Embedding [9], center on building a data-dependent
graph. This allows one to look at similarities between data
points as a way to extract useful relationships. We shall focus
on the Laplacian Eigenmaps embedding technique.
The purpose of Laplacian Eigenmaps, as with all nonlinear dimensionality reduction techniques, is to create a mapping φ : R^d → R^m, where m is the inherent dimension of the underlying data. Let Ω = {x1, ..., xn} ⊂ R^d be a set of training
points. We have a positive, symmetric kernel K : Ω × Ω → R
that encodes relationships between two points. We define a
neighborhood N (x) ⊂ Ω of each x ∈ Ω to be the k closest
points to x, as measured by the kernel K.
We construct a graph G = (Ω, E), where {xi , xj } ∈ E
if xj ∈ N (xi ). Let A be the adjacency matrix of G. Then
A is sparse, with row A_{i,·} containing k non-zero entries. We require A to be symmetric, though whether A is symmetric depends on how the neighborhoods are generated. If A is not symmetric, simply define the weights Ã_{i,j} = max(A_{i,j}, A_{j,i}).
Note that, if we do not symmetrize A, it would be exactly an adjacency matrix for a k-regular graph due to the nearest neighbor condition. We define the diagonal matrix D such that $D_{i,i} = \sum_j A_{i,j}$, and we define the graph Laplacian as L = D − A. Finally, we solve the normalized eigenvalue problem

$$D^{-1/2} L D^{-1/2} \varphi = \lambda \varphi. \tag{1}$$
The smallest eigenvalue is 0 and its associated eigenvector
is left out, as the eigenvector is constant across all nodes
of a connected component of a graph. The m eigenvectors
corresponding to the next smallest m eigenvalues are used to
form the embedding into R^m. In other words, if $\{\varphi_i\}_{i=1}^m$ are the eigenvectors associated with eigenvalues $\{\lambda_i\}_{i=1}^m$, then

$$\varphi(x_i) = (\varphi_1(i), \ldots, \varphi_m(i)).$$

It is worth mentioning that $\{\varphi_i\}_{i=1}^m$ are orthogonal, due to the fact that $D^{-1/2} L D^{-1/2}$ is self-adjoint.
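To make the construction above concrete, the following is a minimal numerical sketch of the pipeline under the paper's assumptions (k-nearest-neighbor indicator kernel, max-symmetrization, the normalized eigenproblem (1)). The function name, the synthetic data, and the parameter choices are illustrative, not taken from the paper.

```python
# A minimal sketch of the Laplacian Eigenmaps pipeline described above.
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, k=25, m=2):
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(D2, np.inf)          # exclude self-neighbors
    # Adjacency: x_i ~ x_j if x_j is among the k nearest neighbors of x_i.
    A = np.zeros((n, n))
    nbrs = np.argsort(D2, axis=1)[:, :k]
    A[np.arange(n)[:, None], nbrs] = 1.0
    A = np.maximum(A, A.T)                # symmetrize: A_ij <- max(A_ij, A_ji)
    deg = A.sum(axis=1)
    L = np.diag(deg) - A                  # graph Laplacian L = D - A
    Dinv_sqrt = np.diag(1.0 / np.sqrt(deg))
    Lsym = Dinv_sqrt @ L @ Dinv_sqrt      # normalized problem (1)
    vals, vecs = eigh(Lsym)
    # Drop the eigenvector for eigenvalue 0; keep the next m.
    return vecs[:, 1:m + 1]               # row i is phi(x_i) in R^m

# Illustrative usage: embed two Gaussian clusters from R^5 into R^2.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 5)), rng.normal(6, 1, (40, 5))])
Phi = laplacian_eigenmaps(X, k=10, m=2)
```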
Analysis of the performance of Laplacian Eigenmaps generally focuses on the assumption that the data lies on a
smooth manifold and that the graph Laplacian approximates
the Laplace-Beltrami operator on that manifold [2]. Focus is
also given to the similarity kernel applied to the data, in order
to generate a more faithful embedding [11]. This paper instead
proposes to study these operators from the context of graph
theory.
Graph theoretic analysis of dimension reduction allows the
results to be independent of distance metric or local geometry
of the data. An example of this is shown in Figure 1. These
two data sets can differ in geometry, and even in dimension.
However, the individual points relate to each other in a similar
manner, and both data sets generate virtually identical graph
Laplacians. Namely, both graphs consist of two nearly disjoint
clusters.
[Fig. 1. Two datasets, (a) Gaussian and (c) Two Moon, with (b), (d) the adjacency matrices of their data-dependent graphs, sorted into clusters for visualization purposes. Gaussian adjacency: µ = 23.7, σ = 1.64; Two Moon adjacency: µ = 23.4, σ = 2.23.]
These clusters can be described graph theoretically. Let
us simplify the context slightly, and assume that the kernel
K(x, y) is an indicator function of whether x and y are
nearest neighbors. We shall define a cluster on n points as
being a randomly chosen k-regular graph with n nodes. This
is because each row of the adjacency matrix has k non-zero
entries due to the k-nearest neighbors algorithm for choosing
edges. Also, the edges within the cluster will be randomly
distributed between other nearby points in the cluster, as there
is no structural difference between the clustered points, except
for noise and slight variability. This means the degree of each
clustered point will be k, and the distribution of those weights
is independent of which point in the cluster is chosen.
Definition I.1. The family of k-regular graphs G_{n,k} is the set of all graphs G = (V, E) with n nodes such that, ∀x ∈ V,

$$\deg(x) \equiv \sum_{\{x,y\}\in E} w_{x,y} = k.$$
We chose k = 25 nearest neighbors for the examples in
Figure 1. For both graph Laplacians, the two clusters are
almost completely disjoint. Also, we report the mean (µ) and
standard deviation (σ) of the degree of the nodes. Clearly, all of the nodes have almost identical degree for both graphs, and both graphs are fairly close to being k-regular.
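As a quick sanity check of this near-regularity claim, one can rebuild the degree statistics reported in Figure 1 on synthetic data. This is a rough sketch with illustrative parameters; the exact values of µ and σ depend on the symmetrization convention used.

```python
# Degree statistics (mu, sigma) of a symmetrized k-NN graph on a Gaussian cloud.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 2))            # a single Gaussian cluster
k = 25
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
np.fill_diagonal(D2, np.inf)
A = np.zeros((2000, 2000))
A[np.arange(2000)[:, None], np.argsort(D2, axis=1)[:, :k]] = 1.0
A = np.maximum(A, A.T)                    # symmetrize as in Section I
deg = A.sum(axis=1)
print("mu = %.1f, sigma = %.2f" % (deg.mean(), deg.std()))
```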
For the rest of this paper, we shall examine results about
graphs with k-regular subgraph clusters, specifically when the
clusters are of differing sizes. Also, we shall assume the kernel K(x, y) is an indicator function of whether x and y are nearest
neighbors. This shall allow us to approximate the behavior of
Laplacian Eigenmaps by utilizing the vast literature that exists
on regular graphs.
II. EIGENVECTOR DISTRIBUTION FOR DISJOINT CLUSTERS WITH HETEROGENEOUS SIZES
For Laplacian Eigenmaps, the common assumption is that one only needs to keep the m ≪ d smallest eigenvectors to create a faithful embedding. However, the choice of m is commonly overlooked, other than assuming m must be at least as large as the intrinsic dimensionality of the data.
A. Example of Eigenvector Distribution
A general approach to choosing m is deciding on the intrinsic dimension of the data. However, Figure 2 demonstrates
the choice of m is more complicated. The data consists of
two clusters in R2 , with cluster C1 containing 10,000 points,
and cluster C2 containing 1000 points. Laplacian Eigenmaps
is run on this example with a Gaussian kernel and 50 nearest
neighbors. The images below show the eigenvectors with the
14 smallest non-zero eigenvalues. Observe that, due solely to the size difference between the clusters, all but one of the eigenvectors have their entire energy concentrated in C1.
[Fig. 2. First image shows the two original clusters (Original), followed by the support of each eigenvector of the graph Laplacian (1st Eig, ..., 12th Eig, 13th Eig). Notice that the first appearance of the smaller cluster does not occur until the 13th eigenvector.]
This can be problematic for a number of reasons. For one, there are no intracluster features in the data. However, 13 of the first 14 eigenvectors are picking up erroneous features in C1. This can lead to issues when the embedded points φ(Ω) are inputs to a clustering algorithm such as k-means or support vector machines. These erroneous features are given undue weight in clustering, leading to errors in classification.

Second, despite C2 constituting a significant portion of the data, almost all the energy in φ(Ω) is concentrated in C1. Again, this poses problems for clustering and classification algorithms. To see this, fix i0 ∈ {1, ..., m} such that supp(φ_{i0}) ⊂ C1. This means φ_{i0}(x) = 0 for x ∈ C2. However, ∃x, y ∈ C1 such that φ_{i0}(x) < 0 and φ_{i0}(y) > 0. Thus, a separating line on φ_{i0} would be unable to differentiate C1 from C2. Since most of the energy of φ(Ω) lies in C1, most i ∈ {1, ..., m} satisfy supp(φ_i) ⊂ C1.

B. Eigenvector Distribution for Unions of Regular Graphs

The logic behind the phenomenon in Figure 2 is based on the distribution of eigenvalues of the Laplacian. Specifically, it depends on the interlacing of eigenvalues of k-regular graphs. The first significant progress in this problem came over 30 years ago in a paper by McKay [8]. He showed that, given a sequence of regular graphs with the number of nodes tending to infinity, the empirical spectral distribution of the scaled adjacency matrix $\frac{1}{\sqrt{k-1}} A_n$ converges to the semicircle

$$f_d(x) = \frac{1}{2\pi}\sqrt{4 - x^2}, \quad -2 < x < 2. \tag{2}$$
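The convergence in (2) is easy to observe numerically. The following sketch samples a random k-regular graph (via networkx, assumed available) and compares the eigenvalue count on an interval to the semicircle mass; all parameters are illustrative.

```python
# Empirical check of McKay's semicircle law (2) on a random k-regular graph.
import numpy as np
import networkx as nx

n, k = 2000, 25
G = nx.random_regular_graph(k, n, seed=0)
A = nx.to_numpy_array(G)
evals = np.linalg.eigvalsh(A) / np.sqrt(k - 1)   # scaled adjacency spectrum

# Count eigenvalues in an interval and compare to the semicircle mass.
a, b = -0.5, 0.5
N_I = np.sum((evals >= a) & (evals <= b))
xs = np.linspace(a, b, 1000)
f_d = np.sqrt(4 - xs ** 2) / (2 * np.pi)
print(N_I, n * np.trapz(f_d, xs))                # the two counts should be close
```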
Following this result, the necessity to avoid cycles was removed in exchange for proving results about random regular graphs. Also, it raised the question of whether such convergence results could be made for finite n. It was not until 2012 that results were proved in this case by Dumitriu and Pal [7].
Theorem II.1 (Theorem 2, [7]). Fix δ > 0, let k = (log n)^γ, and let η = ½(exp(k^{−α}) − exp(−k^{−α})) for 0 < α < min(1, 1/γ). Then there exists an N large enough such that ∀n > N, for G ∈ G_{n,k} chosen randomly with adjacency matrix A, and for any interval I ⊂ R such that |I| ≥ max{2η, η/(−δ log δ)},

$$\Big| N_I - n \int_I f_d(x)\,dx \Big| < n\delta|I|$$

with probability at least 1 − o(1/n). Here, N_I is the number of eigenvalues of $\frac{1}{\sqrt{k-1}} A$ in the interval I, and f_d is the semicircle law in (2).
Using Theorem II.1, we begin to address the phenomenon
that occurs in Figure 2. We give a theorem characterizing the
order in which eigenvalues and eigenvectors concentrated on
either C1 or C2 emerge from a graph Laplacian of a disjoint
graph.
Theorem II.2. Let Γ = (Ω, E) be an undirected graph. Suppose Ω can be split into two disjoint clusters C1 and C2 such that, for the subgraph G1 generated by C1 and the subgraph G2 generated by C2, G1 ∈ G_{n,k} and G2 ∈ G_{n/D,k}. Furthermore, assume ∄{x, y} ∈ E such that x ∈ C1 and y ∈ C2. Fix δ, k, α, and η as in Theorem II.1. Choose any interval I ⊂ [0, 2] such that $|I| \ge \frac{\sqrt{k-1}}{k}\max\{2\eta,\, \eta/(-\delta\log\delta)\}$. Let L denote the graph Laplacian, and σ1, ..., σm denote the m eigenvalues of L that lie in I. Then there exists an orthonormal basis {v1, ..., vm} of associated eigenvectors such that, if $N_I^1 = |\{i : \mathrm{supp}(v_i) \subset C_1\}|$ and $N_I^2 = |\{i : \mathrm{supp}(v_i) \subset C_2\}|$, then $N_I^1 + N_I^2 = m$ and there exists some N such that ∀n > N,

$$|N_I^1 - D\,N_I^2| \le 2\delta n \frac{k}{\sqrt{k-1}}\,|I|$$

with probability at least 1 − o(1/n) over the choice of subgraphs G1 and G2. Moreover, m satisfies

$$\Big| m - \Big(n + \frac{n}{D}\Big) \int_I f_d(x)\,dx \Big| < \delta\Big(n + \frac{n}{D}\Big)\frac{k}{\sqrt{k-1}}\,|I|.$$
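Theorem II.2 can be illustrated numerically: for a disjoint union of two random k-regular graphs of sizes n and n/D, each Laplacian eigenvector can be taken supported on a single cluster, and within a spectral window the count ratio is near D. A rough sketch with illustrative parameters:

```python
# Numerical illustration of Theorem II.2 on a disjoint union of regular graphs.
import numpy as np
import networkx as nx

n, D, k = 2000, 10, 25
G1 = nx.random_regular_graph(k, n, seed=0)
G2 = nx.random_regular_graph(k, n // D, seed=1)
G = nx.disjoint_union(G1, G2)             # no edges between the clusters
L = nx.normalized_laplacian_matrix(G).toarray()
vals, vecs = np.linalg.eigh(L)

# For a disjoint graph, each eigenvector lives (numerically) in one block.
in_C1 = np.linalg.norm(vecs[:n, :], axis=0) > 0.99
I = (vals > 0.2) & (vals < 0.8)           # an interval I inside [0, 2]
N1, N2 = np.sum(I & in_C1), np.sum(I & ~in_C1)
print(N1, N2, N1 / max(N2, 1))            # the ratio should be roughly D
```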
III. WEAKLY CONNECTED CLUSTERS WITH HETEROGENEOUS SIZES
In real datasets, it is unlikely that clusters are disjoint.
However, Theorem II.2 begins to describe eigenvector localization for Laplacian Eigenmaps. The next question that
arises concerns the behavior of weakly connected clusters with
heterogeneous sizes, i.e., graphs with a small number of edges
between the clusters. This characterizes a larger and more
realistic class of data analysis problems.
Definition III.1. A graph with weakly connected clusters of order t is a connected graph with adjacency matrix

$$A = \begin{pmatrix} A_1 & B_{1,2} \\ B_{1,2}^\top & A_2 \end{pmatrix},$$

where B_{1,2} has t non-zero entries, and A_1 and A_2 are adjacency matrices of k-regular graphs.
We now restate the eigenvector localization problem on
a graph with weakly connected clusters as a problem of
matrix perturbation. Consider two graphs H and G, where
H is a disjoint regular graph that satisfies the assumptions
of Theorem II.2, and G is a graph with weakly connected clusters of order t. In other words,

$$A_H = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix} \quad\text{and}\quad A_G = \begin{pmatrix} A_1 & B_{1,2} \\ B_{1,2}^\top & A_2 \end{pmatrix}.$$

Then A_G = A_H + B, where B is a block 2 × 2, 2t-sparse adjacency matrix that only has
terms on the block off-diagonal. Clearly, one can see AG as a
perturbed version of AH , and the eigenvalues and eigenvectors
of AH are completely characterized by Theorem II.2. This
makes perturbation theory a valid approach to showing that the
eigenvalues and eigenvectors of AG (and the graph Laplacian
LG ) do not deviate much from the known quantities of AH .
A. Eigenvalue Distribution
First, we shall consider the eigenvalue distribution of this
new perturbed matrix. Theorem III.2 is a variant of Weyl’s
inequality, which says the eigenvalues of a perturbed matrix A + E deviate by at most the largest eigenvalue of the error matrix E [14]. This variant, which relies heavily on results
from [4] for normalized graph Laplacians, relates specifically
to graphs with weakly connected clusters.
Theorem III.2. Let Γ = (Ω, E) be a graph with weakly connected clusters of order t, such that one cluster is of size n and the other cluster is of size n/D. Fix δ, k, α, η, and I as in Theorem II.2.

Let L denote the graph Laplacian, and σ1, ..., σm denote the m eigenvalues of L that lie in I. Then m satisfies

$$\Big| m - \Big(n + \frac{n}{D}\Big) \int_I f_d(x)\,dx \Big| < \delta\Big(n + \frac{n}{D}\Big)\frac{k}{\sqrt{k-1}}\,|I| + 2t,$$

again with probability at least 1 − o(1/n).
B. Eigenvector Distribution
Now, we shall consider the eigenvector distribution of a
graph with weakly connected clusters by considering it as
a matrix perturbation problem. Davis and Kahan [6] were
the first to give general theorems relating to the invariant
subspaces of two Hermitian matrices. These results were
extended by Stewart [12] via an iterative process for generating
the invariant subspaces.
These theories center around the distribution of eigenvalues
and eigenvalue gaps.
Definition III.3. The eigenvalue separation of two n × n matrices A and B with spectra $\sigma(A) = \{\lambda_1^A, \ldots, \lambda_n^A\}$ and $\sigma(B) = \{\lambda_1^B, \ldots, \lambda_n^B\}$ is defined as

$$\operatorname{sep}(A, B) = \min_{i,j} |\lambda_i^A - \lambda_j^B|.$$
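Written out directly, for Hermitian A and B this separation is just a minimum over all pairwise eigenvalue differences. A small helper (illustrative, not code from [12]):

```python
# Eigenvalue separation of Definition III.3 for symmetric/Hermitian matrices.
import numpy as np

def sep(A, B):
    lam_A = np.linalg.eigvalsh(A)
    lam_B = np.linalg.eigvalsh(B)
    # min over all pairs |lambda_i^A - lambda_j^B|
    return np.abs(lam_A[:, None] - lam_B[None, :]).min()
```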
Theorem III.4 (Theorem 4.11, [12]). Let A, E ∈ C^{n×n}. Let X = [X1, X2] be a unitary matrix with X1 ∈ C^{n×l}, and suppose R(X1) is an invariant subspace of A. Let

$$X^* A X = \begin{pmatrix} A_{1,1} & A_{1,2} \\ 0 & A_{2,2} \end{pmatrix}, \qquad X^* E X = \begin{pmatrix} E_{1,1} & E_{1,2} \\ E_{2,1} & E_{2,2} \end{pmatrix}.$$

Let δ = sep(A_{1,1}, A_{2,2}) − ‖E_{1,1}‖ − ‖E_{2,2}‖. Then, if

$$\frac{\|E_{2,1}\|\,\big(\|A_{1,2}\| + \|E_{1,2}\|\big)}{\delta^2} \le \frac{1}{4}, \tag{3}$$

there is a matrix P satisfying

$$\|P\| \le 2\,\frac{\|E_{2,1}\|}{\delta},$$

such that

$$\widetilde{X}_1 = (X_1 + X_2 P)(I + P^* P)^{-1/2} \tag{4}$$

spans an invariant subspace of A + E.
Theorem III.4 gives a sufficient condition for guaranteeing
that an eigenspace X1 remains relatively preserved under
perturbation. Under the condition that A is a graph with
weakly connected clusters, and sep(A1,1 , A2,2 ) 6= 0, Theorem
III.4 gives us bounds on the individual eigenvectors under
perturbation.
This type of theorem is one approach to showing that the eigenvectors of a graph with weakly connected clusters remain localized. It implies that greater eigenvalue separation leads to better eigenvector localization. However, the conditions that need to be satisfied are too strict for our problem, given that σ(L_G) ⊂ [0, 2] regardless of the number of points.
To demonstrate this disparity between theory and example,
consider the two moons dataset from Figure 1(c). In this
dataset, there are 7 edges connecting C1 and C2 . |C1 | = 1989
and |C2 | = 211, meaning |C1 | = D · |C2 | where D = 9.4.
We shall examine the smallest 10% of eigenvalues and their
associated eigenvectors, as these are the vectors that are most
commonly chosen for the Laplacian Eigenmaps algorithm.
Let LG be the Laplacian of the graph with weakly connected clusters, and LH be the Laplacian of the graph with
disjoint clusters. Let {v1 , ..., v200 } be the eigenvectors of
LG and {w1 , ..., w200 } be the eigenvectors of LH . Figure 3
plots hvi , wi i for i ∈ {1, ..., 200}. Clearly, there is a large
discrepancy between theory and practice. Theorem III.4 only predicts that 26 eigenvectors satisfy the spectral-gap assumptions necessary to guarantee that (4) holds with ‖P‖ < 1, which would imply $\langle v_i, w_i\rangle > \frac{\sqrt{2}}{2}$. However, 180 of the eigenvectors actually satisfy $\langle v_i, w_i\rangle > \frac{\sqrt{2}}{2}$.
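The experiment behind Figure 3 is straightforward to reproduce in spirit, substituting synthetic regular clusters for the two-moons graph. Since eigenvector signs are arbitrary, the sketch below compares |⟨vi, wi⟩|; the sizes, seeds, and t = 7 inter-cluster edges are illustrative.

```python
# Compare eigenvectors of the disjoint Laplacian L_H with those of the weakly
# connected Laplacian L_G, pairing them by index.
import numpy as np
import networkx as nx

n, D, k, t = 2000, 10, 25, 7
rng = np.random.default_rng(0)
H = nx.disjoint_union(nx.random_regular_graph(k, n, seed=0),
                      nx.random_regular_graph(k, n // D, seed=1))
G = H.copy()
for _ in range(t):                        # add t random inter-cluster edges
    G.add_edge(rng.integers(0, n), rng.integers(n, n + n // D))

LH = nx.normalized_laplacian_matrix(H).toarray()
LG = nx.normalized_laplacian_matrix(G).toarray()
_, W = np.linalg.eigh(LH)
_, V = np.linalg.eigh(LG)

angles = np.abs(np.sum(V[:, :200] * W[:, :200], axis=0))   # |<v_i, w_i>|
print(np.sum(angles > np.sqrt(2) / 2))    # count of well-preserved eigenvectors
```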
[Fig. 3. (a) Actual vector angles ⟨vi, wi⟩ for the first 200 eigenvectors of the data from Figure 1(c), versus (b) the vector angles predicted by the spectral gap from Theorem III.4.]
A more enlightening depiction of this discrepancy can be
seen in Figure 4. This is another plot of the vector angles (same
as Figure 3(a)), except now the indices {i : supp(wi ) ⊂ C2 }
are marked with a vertical line. Recall from Theorem II.2 that this occurs on average once out of every D indices.
[Fig. 4. Vector angles for the first 200 eigenvectors of the data from Figure 1(c), with green vertical lines denoting eigenvalues for which λi ∈ {λi : supp(wi) ⊂ C2}. Blue dots: {⟨wi, vi⟩ : supp(wi) ⊂ C1}; red dots: {⟨wi, vi⟩ : supp(wi) ⊂ C2}.]

Notice that the only deviation ⟨vi, wi⟩ makes from being close to 1 occurs on or near the indices for which supp(wi) ⊂ C2. This suggests why Theorem III.4 is not sufficient for the current setting. Theorem III.4 gives a condition under which ‖P‖ < 1. However, it does not speak to which eigenvectors from X2 contribute to X̃1 in (4), regardless of whether (3) is violated.

Figure 4 suggests that the eigenvectors from X2 that contribute to X̃1 are exactly those that are nearest in eigenvalue. This is why points near a vertical line for λi ∈ {λi : supp(wi) ⊂ C2} are less robust to perturbation. This leads to the following theorem.

Theorem III.5. Let A be a symmetric n × n matrix with eigendecomposition A = V ΣV*. Let (λi, vi) be an eigenpair of A. Partition V by ordering the eigenvalues such that V = [V1, V2, vi, V3, V4], where V2, V3 ∈ R^{n×s}. Moreover, assume ∃C ⊊ {1, ..., n} such that supp(vi) ⊂ C and supp(vj) ⊂ C whenever vj is a column of V2 or V3. Let (λ̃, x) be an eigenpair of the perturbed matrix A + E, where x = [x1, ..., xn]. Then

$$\sum_{j \in C^c} |x_j|^2 \le \frac{\big\|(\tilde{\lambda} - \lambda_i)x - Ex\big\|_2^2}{\min(\lambda_i - \lambda_{i-s},\, \lambda_{i+s} - \lambda_i)^2}.$$
Theorem III.5 demonstrates that, when there exists a series of eigenvectors concentrated on a subset of the points C, the mass that the new, corresponding eigenvectors place outside C is bounded by a quantity inversely proportional to the square of the eigenvalue gap. It is related to results from [10].
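The bound in Theorem III.5 can be checked directly on a small synthetic matrix. In the sketch below, A is block diagonal so that its eigenvectors are supported on C or on C^c, the blocks are spectrally separated, and the eigenpair of A + E is matched to (λi, vi) by index; all sizes and constants are illustrative.

```python
# Self-contained numerical check of the Theorem III.5 bound.
import numpy as np

rng = np.random.default_rng(0)
nC, nCc = 8, 4                          # |C| and |C^c|
A = np.zeros((nC + nCc, nC + nCc))
M1 = rng.normal(size=(nC, nC))
A[:nC, :nC] = (M1 + M1.T) / 2           # block supported on C
M2 = rng.normal(size=(nCc, nCc))
A[nC:, nC:] = 10 * np.eye(nCc) + (M2 + M2.T) / 2   # spectrally separated block
E = 0.05 * rng.normal(size=A.shape)
E = (E + E.T) / 2                       # small symmetric perturbation

lam, V = np.linalg.eigh(A)              # sorted eigenvalues of A
lam_t, X = np.linalg.eigh(A + E)        # eigenpairs of the perturbed matrix
i, s = 3, 2                             # eigenpair (lam[i], v_i); V2, V3 of width s
x = X[:, i]                             # perturbed eigenvector, matched by index
lhs = np.sum(x[nC:] ** 2)               # mass outside C
gap = min(lam[i] - lam[i - s], lam[i + s] - lam[i])
rhs = np.linalg.norm((lam_t[i] - lam[i]) * x - E @ x) ** 2 / gap ** 2
print(lhs, "<=", rhs)                   # the bound of Theorem III.5
```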
Using Theorem III.5, we attempt to predict the number of eigenvectors from Figure 1(c) that remain concentrated in the appropriate cluster. Recall that 180 of the first 200 eigenvectors satisfied $\langle v_i, w_i\rangle > \frac{\sqrt{2}}{2}$, which is a similar condition to predicting that $\sum_{j\in C^c} |x_j|^2 < 0.5$. Theorem III.5 predicts that 130 of the eigenvectors will satisfy $\sum_{j\in C^c} |x_j|^2 < 0.5$ and remain concentrated in their respective clusters. While this is less than the 180 that actually remain localized, the prediction of 130 is far better than the prediction of 26 obtained using Theorem III.4. More importantly, of those 130 predicted eigenvectors, 127 are concentrated on the larger cluster C1. Only 3 are concentrated on the smaller cluster C2.
IV. RESULTS OF EIGENVECTOR CONCENTRATION THEOREMS

A. Interpretation of Results
Let us consider the results of Theorem III.5. This eigenvector concentration result, along with the disjoint graph results from Theorem II.2, suggests a negative result for differentiating small clusters C2 from a larger background cluster C1 using Laplacian Eigenmaps. These results suggest that small clusters are forced to 0 for most eigenvectors of the graph Laplacian. This makes classification, and especially determining intra-cluster differences within C2, very difficult.
On top of that, Theorem III.5 suggests that even if LH
has an eigenvector wi supported on C2 , its corresponding
eigenvector vi of LG for the graph with weakly connected
clusters may not remain supported on C2 . This is because,
while supp(wi ) ⊂ C2 , supp(wi−1 ) and supp(wi+1 ) are most
likely concentrated on C1 due to Theorem II.2.
B. Experimental Demonstration of Results
The theories throughout this paper are descriptive in nature,
meaning the goal is to characterize the embeddings. The main
takeaway from these results is that the current method of selecting the number m of eigenvectors kept fails to take into
account the uneven distribution of eigenvectors concentrated
on small clusters of anomalous data.
However, it is still possible to experimentally verify this
descriptive theory and possibly gain an algorithmic advantage.
Consider a very noisy version of the two moons dataset (see
Figure 5). We shall call the graph G = (Ω, E). Once again,
the larger cluster C1 is much larger than the smaller cluster
C2. Specifically, we pick |C1| = 9,500 and |C2| = 500. One
question is how to separate these two clusters in an optimal
way. One approach would be to use a graph cut algorithm.
This is a method that finds a set V ⊂ Ω that minimizes

$$\operatorname{RatioCut}(\Omega) = \arg\min_{V \subset \Omega} \frac{|E(V, V^c)|}{|V| \cdot |V^c|},$$
where E(V, V^c) = {{x, y} ∈ E : x ∈ V, y ∈ V^c}. While this problem is non-convex, a common approximation involves thresholding the second eigenvector of the graph Laplacian via the function sgn(φ2(Ω) − φ̄2(Ω)), where φ̄2(Ω) is the mean of the second eigenvector [13].
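As a sketch, this thresholding step can be written in a few lines; it assumes a symmetric adjacency matrix A and uses the unnormalized Laplacian, as in the approximation described above.

```python
# Approximate RatioCut: threshold the second Laplacian eigenvector at its mean.
import numpy as np

def spectral_cut(A):
    deg = A.sum(axis=1)
    L = np.diag(deg) - A                   # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)
    phi2 = vecs[:, 1]                      # second-smallest eigenvector (Fiedler vector)
    return np.sign(phi2 - phi2.mean())     # +/-1 labels for the two sides of the cut
```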
Unfortunately, even though V = C2 would be the correct
split of the data, |C2 | is too small to achieve the minimum of
RatioCut. This is reflected in Figure 5.
Instead, we can take into account the knowledge that
anomalous clusters are forced to zero for most eigenvectors
φi. One simply calculates a new scoring function

$$\rho_{m,\varepsilon}(x) = \#\{\, i \in \{1, \ldots, m\} : |\varphi_i(x)| < \varepsilon \,\}, \tag{5}$$

and uses ρ_{m,ε} : Ω → R to cluster the data points into two clusters. We compare this scoring function to eigenvector thresholding for the graph cut in Figure 5. The scoring function in (5) clearly gives the correct clustering, with an error rate of only 0.03%.
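The scoring function (5) itself is one line. Below is a minimal sketch; Phi is assumed to hold the eigenvectors column-wise (as produced by the embedding sketch in Section I), and the cutoff used to split the scores into two clusters is illustrative.

```python
# Scoring function (5): count near-zero entries across the first m eigenvectors.
import numpy as np

def rho(Phi, m, eps=1e-3):
    # Phi: n x M array whose columns are Laplacian eigenvectors (smallest
    # non-zero eigenvalues first). Anomalous points score high, since most
    # eigenvectors vanish on the small cluster.
    return np.sum(np.abs(Phi[:, :m]) < eps, axis=1)

# Hypothetical usage: points scoring above a cutoff are flagged as anomalous.
# labels = rho(Phi, m=50, eps=1e-3) > 25   # illustrative cutoff, not from the paper
```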
[Fig. 5. Class separation for the two moons example with |C1| = 9,500 and |C2| = 500: (a) graph cut, (b) clusters generated by ρ50,0.001. The clusters generated by ρ50,0.001 incorrectly classified 3 of the 10,000 points in the example.]
V. EXAMPLES
Throughout this paper, we have used the synthetic example
of the two moons dataset from Figure 1(c). However, Theorems II.2 and III.5 are completely general within the aforementioned class of graphs with weakly connected clusters.
To demonstrate this, we shall examine a radiation detection dataset collected by the Unmanned Systems Lab (USL) at Virginia Tech using systems mounted on an unmanned aerial vehicle (UAV), a Yamaha RMAX helicopter [3]. The UAV collected a low-altitude aerial mapping of spectral data. During the experiment, two radioactive sources were present: 0.084 Ci of 137Cs and 0.00048 Ci of 133Ba. The flight path, as well as the
location of the strongest radiological spectra, can be seen in
Figure 6(a).
Using this data, we built a 10 nearest neighbor graph
based on Euclidean distances between the radiological spectra
at each location. The first eigenvector concentrated on the
anomalous cluster emerges in the third smallest eigenvector,
but the next emergence of such an eigenvector is not until the 81st smallest. This is consistent with our theory, given that |C1|/|C2| = 50.

Figure 6(b) shows the graph cut generated by the smallest non-zero eigenvector, and Figure 6(c) shows a class separator generated by (5) on 25 eigenvectors. Using (5) gives a perfect classifier for the anomalous cluster, whereas the graph cut does not reflect the anomalous cluster in any way.

[Fig. 6. (a) Data points from the flight path; the points in red mark the locations of the anomalous radiological signature. (b) Nuclear data class separator using the graph cut. (c) Nuclear data class separator using (5), ρ25,0.001(x).]

VI. CONCLUSIONS

This paper proves that Laplacian Eigenmap embeddings rarely concentrate on small, anomalous data clusters. This affects the choice of m in the algorithm, as well as shifting most of the energy in the embedding to background clusters. However, it is possible to exploit this characteristic by using (5) to create a segmentation function that easily differentiates small clusters. We plan to extend this analysis to other dimension reduction techniques in future work.

REFERENCES
[1] M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373–1396, 2003.
[2] M. Belkin and P. Niyogi. Convergence of Laplacian eigenmaps. In NIPS, pages 129–136, 2006.
[3] J. Benedetto, A. Cloninger, W. Czaja, T. Doster, K. Kochersberger, B. Manning, T. McCullough, and M. McLean. Operator-based integration of information in multimodal radiological search mission with applications to anomaly detection. Proc. SPIE, 9073:90731A, 2014.
[4] G. Chen, G. Davis, F. Hall, Z. Li, K. Patel, and M. Stewart. An interlacing result on normalized Laplacians. SIAM J. Discrete Math., 18(2):353–361, 2005.
[5] R. R. Coifman and S. Lafon. Diffusion maps. Applied and Computational Harmonic Analysis, 21(1):5–30, 2006.
[6] C. Davis and W. Kahan. The rotation of eigenvectors by a perturbation. III. SIAM Journal on Numerical Analysis, 7(1):1–46, 1970.
[7] I. Dumitriu and S. Pal. Sparse regular random graphs: Spectral density and eigenvectors. The Annals of Probability, 40(5):2197–2235, 2012.
[8] B. McKay. The expected eigenvalue distribution of a large regular graph. Linear Algebra and its Applications, 40:203–216, 1981.
[9] S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323–2326, 2000.
[10] A. Ruhe. Perturbation bounds for means of eigenvalues and invariant subspaces. BIT Numerical Mathematics, 10:343–354, 1970.
[11] A. Singer and R. R. Coifman. Non-linear independent component analysis with diffusion maps. Applied and Computational Harmonic Analysis, 25(2):226–239, 2008.
[12] G. W. Stewart and J. G. Sun. Matrix Perturbation Theory. Academic Press, 1990.
[13] A. Szlam and X. Bresson. A total variation-based graph clustering algorithm for Cheeger ratio cuts. UCLA CAM Report 09-68, 2009.
[14] H. Weyl. Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen. Math. Ann., 71:441–479, 1912.