Spectral Clustering
Jianping Fan
Dept of Computer Science
UNC, Charlotte
Lecture Outline

Motivation
Graph overview and construction
Spectral Clustering
Cool implementations
Semantic Interpretations of Clusters
Spectral Clustering Example – 2 Spirals

[Figure: two interleaved spirals in the original 2-D space, and the same points in the embedded space.]

The dataset exhibits complex cluster shapes. K-means performs very poorly in this space due to its bias toward dense, spherical clusters. In the embedded space given by the two leading eigenvectors, the clusters are trivial to separate.
Spectral Clustering Example

[Figure: original points and the K-means result with 2 clusters.]

Why does k-means fail for these two examples?
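K-means fails here because it models clusters as compact, roughly spherical blobs around centroids, while spectral clustering works in the eigenvector embedding. A minimal sketch of the comparison (assuming numpy and scikit-learn are available; the spiral generator and all parameter values are illustrative, not taken from the slides):

```python
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering

# Build two interleaved spirals (illustrative stand-in for the slide's data).
rng = np.random.default_rng(0)
n = 200
theta = np.linspace(0, 3 * np.pi, n)
r = np.linspace(0.2, 2.0, n)
spiral1 = np.c_[r * np.cos(theta), r * np.sin(theta)]
spiral2 = -spiral1                          # second spiral, rotated 180 degrees
X = np.vstack([spiral1, spiral2]) + rng.normal(scale=0.02, size=(2 * n, 2))

# K-means splits the plane into two halves and mixes the spirals...
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
# ...while spectral clustering on a nearest-neighbor graph separates them.
sc_labels = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                               n_neighbors=10, random_state=0).fit_predict(X)
```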
Lecture Outline

Motivation
Graph overview and construction
Spectral Clustering
Cool implementations
Graph-based Representation of Data Similarity

[Figure: data points connected by edges whose weights encode pairwise similarity.]

Graph-based Representation of Data Relationships

[Figure: data sampled from a manifold; the neighborhood graph approximates the manifold structure.]

Data Graph Construction

Graph Cut
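As a concrete sketch of the graph construction outlined above (pure numpy; the Gaussian edge weights, neighborhood size k, and kernel width sigma are illustrative choices, not values fixed by the slides):

```python
import numpy as np

def knn_affinity(X, k=10, sigma=1.0):
    """Symmetric k-nearest-neighbor data graph with Gaussian edge weights."""
    # Pairwise squared Euclidean distances between all rows of X.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                 # no self-loops
    # Keep each point's k strongest edges, then symmetrize the graph.
    keep = np.argsort(W, axis=1)[:, -k:]
    mask = np.zeros_like(W, dtype=bool)
    mask[np.arange(len(X))[:, None], keep] = True
    return np.where(mask | mask.T, W, 0.0)
```

The degree matrix used by the cut criteria below is then simply D = np.diag(W.sum(axis=1)).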
Lecture Outline

Motivation
Graph overview and construction
Spectral Clustering
Cool implementations
Graph-based Representation of Data Relationships

Graph Cut

Eigenvectors & Eigenvalues
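Since every cut criterion below is ultimately solved through an eigen-decomposition, here is a tiny numpy check of the defining relation A v = \lambda v (an illustrative toy example, not from the slides):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                  # small symmetric matrix
vals, vecs = np.linalg.eigh(A)              # eigenvalues in ascending order
for lam, v in zip(vals, vecs.T):            # columns of vecs are eigenvectors
    assert np.allclose(A @ v, lam * v)      # defining property A v = lambda v
print(vals)                                 # [1. 3.]
```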
Normalized Cut

A graph G(V, E) can be partitioned into two disjoint sets A and B. The cut is defined as the total weight of the edges running between the two sets:

cut(A, B) = \sum_{u \in A, v \in B} w(u, v)

An optimal partition of the graph G is achieved by minimizing the cut:

\min_{A, B} \; cut(A, B)
Normalized Cut

Normalize each cut term by the association between the partition set and the whole graph:

assoc(A, V) = \sum_{u \in A, t \in V} w(u, t)

The normalized cut is then

Ncut(A, B) = \frac{cut(A, B)}{assoc(A, V)} + \frac{cut(A, B)}{assoc(B, V)}

With D the diagonal degree matrix and W the affinity matrix, the normalized cut becomes a Rayleigh quotient, and the relaxed problem can be solved by the generalized eigenvalue equation:

(D - W) y = \lambda D y

The eigenvector y for the second-smallest eigenvalue yields the real-valued partition indicator.
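A minimal sketch of that solve with scipy's generalized symmetric eigensolver (it assumes W is symmetric with no isolated nodes, so D is positive definite; thresholding the second eigenvector at zero is one simple splitting rule, not the only one):

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(W):
    """Two-way normalized cut via the relaxed problem (D - W) y = lambda D y."""
    D = np.diag(W.sum(axis=1))              # diagonal degree matrix
    L = D - W                               # unnormalized graph Laplacian
    vals, vecs = eigh(L, D)                 # generalized eigensolve, ascending
    fiedler = vecs[:, 1]                    # skip the trivial constant vector
    return fiedler > 0                      # boolean mask: True -> set A
```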
K-way Min-Max Cut

Intra-cluster similarity: s(A_p, A_p) = \sum_{u, v \in A_p} w(u, v)

Inter-cluster similarity: s(A_p, A_q) = \sum_{u \in A_p, v \in A_q} w(u, v)

Decision function for spectral clustering: minimize the similarity between clusters while maximizing the similarity within them,

\min \sum_{p} \frac{\sum_{q \neq p} s(A_p, A_q)}{s(A_p, A_p)}
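For concreteness, two illustrative helpers (not from the slides) for evaluating these quantities given an affinity matrix W and boolean cluster masks:

```python
import numpy as np

def inter_similarity(W, a_mask, b_mask):
    """s(A, B): total edge weight running between sets A and B."""
    return W[np.ix_(a_mask, b_mask)].sum()

def intra_similarity(W, a_mask):
    """s(A, A): total edge weight inside set A."""
    return W[np.ix_(a_mask, a_mask)].sum()
```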
Mathematical Description of Spectral Clustering

Refined decision function for spectral clustering: we can further define a matrix H whose columns h_1, ..., h_k are scaled cluster-indicator vectors, so that the cut objective becomes a trace form:

\min_{H} \mathrm{Tr}(H^T L H) \quad \text{s.t.} \quad H^T H = I

This decision function can be solved as a standard eigenvalue problem: the relaxed minimizer is given by the k eigenvectors of the graph Laplacian L with the smallest eigenvalues.
Spectral Clustering Algorithm
Ng, Jordan, and Weiss

Motivation

Given a set of points S = {s_1, ..., s_n} \subset R^l, we would like to cluster them into k subsets.
Algorithm

Form the affinity matrix W \in R^{n \times n}, defined by

W_{ij} = \exp(-\|s_i - s_j\|^2 / 2\sigma^2) \text{ if } i \neq j, \qquad W_{ii} = 0

where the scaling parameter \sigma is chosen by the user. Define D as the diagonal matrix whose (i, i) element is the sum of W's row i.
Algorithm

Form the matrix L = D^{-1/2} W D^{-1/2}.

Find x_1, x_2, ..., x_k, the k largest eigenvectors of L. These form the columns of the new matrix X.

Note: we have reduced the dimension from n \times n to n \times k.
Algorithm

Form the matrix Y \in R^{n \times k} by renormalizing each of X's rows to have unit length:

Y_{ij} = X_{ij} \Big/ \Big( \sum_j X_{ij}^2 \Big)^{1/2}

Treat each row of Y as a point in R^k and cluster the rows into k clusters via K-means.
Algorithm

Final cluster assignment: assign point s_i to cluster j if and only if row i of Y was assigned to cluster j.
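The four steps above translate almost line-for-line into numpy. A compact sketch (k-means is borrowed from scikit-learn; sigma and the seeds are illustrative defaults, and no numerical safeguards are included):

```python
import numpy as np
from sklearn.cluster import KMeans

def njw_spectral_clustering(S, k, sigma=1.0):
    # Step 1: Gaussian affinity matrix W with zeroed diagonal, and D.
    d2 = ((S[:, None, :] - S[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Step 2: L = D^{-1/2} W D^{-1/2}; its k largest eigenvectors -> X.
    L = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)          # ascending eigenvalues
    X = vecs[:, -k:]                        # n x k
    # Step 3: renormalize each row of X to unit length -> Y.
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Step 4: k-means on the rows of Y; row i's label is s_i's cluster.
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Y)
```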
Why?

If we eventually use K-means, why not just apply K-means to the original data? Because this method allows us to cluster non-convex regions.
Some Examples

[Figures: spectral clustering results on several example datasets.]
User’s Prerogative

Affinity matrix construction

Choice of the scaling factor \sigma^2: realistically, search over \sigma^2 and pick the value that gives the tightest clusters.

Choice of k, the number of clusters

Choice of clustering method
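One heuristic realization of that search (a sketch, not a procedure prescribed by the deck): embed the data for each candidate width and keep the sigma whose k-means clusters in the embedded space are tightest, as measured by k-means inertia:

```python
import numpy as np
from sklearn.cluster import KMeans

def pick_sigma(S, k, candidates=(0.05, 0.1, 0.2, 0.5, 1.0, 2.0)):
    """Return the candidate sigma giving the tightest embedded-space clusters."""
    best_inertia, best_sigma = np.inf, None
    for sigma in candidates:                # arbitrary illustrative grid
        d2 = ((S[:, None, :] - S[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (2 * sigma ** 2))
        np.fill_diagonal(W, 0.0)
        d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
        L = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
        X = np.linalg.eigh(L)[1][:, -k:]    # k largest eigenvectors
        Y = X / np.linalg.norm(X, axis=1, keepdims=True)
        inertia = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Y).inertia_
        if inertia < best_inertia:
            best_inertia, best_sigma = inertia, sigma
    return best_sigma
```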
How to select k?

Eigengap: the difference between two consecutive eigenvalues. The most stable clustering is generally given by the value of k that maximizes the eigengap

\Delta_k = |\lambda_k - \lambda_{k-1}|

[Figure: the 20 largest eigenvalues of the Cisi/Medline data. The largest gap falls between \lambda_1 and \lambda_2, i.e. \max_k \Delta_k = |\lambda_2 - \lambda_1|, so we choose k = 2.]
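A direct numpy rendering of the eigengap heuristic (the choice of matrix M, e.g. the affinity matrix whose top eigenvalues are plotted above, and the k_max cap are left to the caller):

```python
import numpy as np

def choose_k_by_eigengap(M, k_max=20):
    """Pick k in [2, k_max] maximizing Delta_k = |lambda_k - lambda_{k-1}|."""
    vals = np.sort(np.linalg.eigvalsh(M))[::-1]   # lambda_1 >= lambda_2 >= ...
    gaps = np.abs(np.diff(vals[:k_max]))          # gaps[j] = Delta_{j+2}
    return int(np.argmax(gaps)) + 2
```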
Recap – The bottom line
Summary

Spectral clustering can help us in hard clustering problems.
The technique is simple to understand.
The solution comes from solving a simple linear-algebra problem which is not hard to implement.
Great care should be taken in choosing the "starting conditions": the affinity construction, the scaling factor \sigma^2, and the number of clusters k.