Hierarchical Affinity Propagation

Inmar E. Givoni, Clement Chung,
Brendan J. Frey
Outline
• A Binary Model for Affinity Propagation
• Hierarchical Affinity Propagation
• Experiments
A Binary Model for Affinity Propagation
AP was originally derived as an instance of the
max-product (belief propagation) algorithm in a
loopy factor graph.
What is a factor graph?
• Definition: A factor graph is a bipartite graph that expresses the structure of a factorization. A factor graph has a variable node for each variable x_i, a factor node for each local function f_j, and an edge connecting variable node x_i to factor node f_j if and only if x_i is an argument of f_j.
Definition (restated): a factor graph is a graphical representation of the factorization of a function, and generally contains two kinds of nodes: variable nodes and function nodes. A global function can be decomposed into a product of local functions, and these local functions together with their associated variables appear as the factors of the graph.
E.g.: let g(x1, x2, x3, x4, x5) be a function of five variables, and suppose g can be expressed as
g(x1, x2, x3, x4, x5) = f_A(x1) f_B(x2) f_C(x1, x2, x3) f_D(x3, x4) f_E(x3, x5)
A factor graph for the product f_A(x1) f_B(x2) f_C(x1, x2, x3) f_D(x3, x4) f_E(x3, x5).
• The Max-Sum Update Rules
The message from a variable node to a function node is the sum of the messages that the variable node receives from its other neighboring function nodes,
where the notation ne(x)\f is used to indicate the set of variable node x's neighbors excluding function node f.
The message from a function node to a variable node is the maximum, over the settings of the other variables, of the function's value plus the sum of the messages those other variable nodes send to it.
• ne(f)\x is used to indicate the set of function node f's neighbors excluding variable node x.
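The two rules above can be written compactly (a standard statement of the max-sum updates; μ denotes a message):

```latex
\begin{align*}
\mu_{x \to f}(x) &= \sum_{g \,\in\, \mathrm{ne}(x)\setminus f} \mu_{g \to x}(x) \\
\mu_{f \to x}(x) &= \max_{\mathrm{ne}(f)\setminus x}
  \Big[\, f\big(x, \mathrm{ne}(f)\setminus x\big)
  \;+\; \sum_{x' \,\in\, \mathrm{ne}(f)\setminus x} \mu_{x' \to f}(x') \Big]
\end{align*}
```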
A binary variable model for affinity propagation
• c_ij = 1 if the exemplar for point i is point j.
• I_i, E_j, S_ij are the function nodes. The I_i function nodes enforce the single-exemplar constraints: in every row i of the grid, exactly one c_ij, j ∈ {1, . . . , N}, variable must be set to 1. The E_j function nodes enforce the exemplar-consistency constraints: in every column j, the set of c_ij, i ∈ {1, . . . , N}, variables set to 1 indicates all the points that have chosen point j as their exemplar.
The I_i, E_j, S_ij function nodes.
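Concretely, these function nodes can be written as follows (a sketch in the binary-variable AP notation, with s(i, j) denoting the input similarity):

```latex
\begin{align*}
S_{ij}(c_{ij}) &= s(i,j)\, c_{ij} \\[4pt]
I_i(c_{i1},\dots,c_{iN}) &=
  \begin{cases} -\infty & \text{if } \sum_{j} c_{ij} \neq 1 \\ 0 & \text{otherwise} \end{cases} \\[4pt]
E_j(c_{1j},\dots,c_{Nj}) &=
  \begin{cases} -\infty & \text{if } c_{jj} = 0 \text{ and } \exists\, i \neq j:\ c_{ij} = 1 \\ 0 & \text{otherwise} \end{cases}
\end{align*}
```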
• We derive the scalar message updates in the
binary variable AP model. Recall the max-sum
message update rules.
• The scalar message difference β_ij(1) − β_ij(0) is denoted by β_ij. Similar notation is used for α, ρ, and η. In what follows, for each message we calculate its value for each setting of the binary variable and then take the difference.
• The α_ij messages are identical to the AP availability messages a(i, j), and the ρ_ij messages are identical to the AP responsibility messages r(i, j). Thus, we have recovered the original affinity propagation updates.
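For reference, the recovered affinity propagation updates of Frey and Dueck are:

```latex
\begin{align*}
r(i,j) &= s(i,j) - \max_{j' \neq j}\big[\, a(i,j') + s(i,j') \,\big] \\
a(i,j) &= \min\Big(0,\; r(j,j) + \sum_{i' \notin \{i,j\}} \max\big(0,\, r(i',j)\big)\Big), \qquad i \neq j \\
a(j,j) &= \sum_{i' \neq j} \max\big(0,\, r(i',j)\big)
\end{align*}
```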
Hierarchical Affinity Propagation
• Goal: to solve the hierarchical clustering problem.
• What is hierarchical clustering?
Hierarchical clustering differs substantially from the clustering methods discussed so far: rather than producing a single clustering, it produces a hierarchy of clusterings; in plain terms, a tree of clusters.
Hierarchical clustering algorithms come in two varieties: agglomerative (bottom-up) and divisive (top-down). In the bottom-up approach, each data point initially forms its own cluster; at each iteration, the two closest clusters are merged, until only one cluster remains, at which point the tree is complete. The top-down approach is the reverse process.
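The bottom-up procedure above can be sketched in a few lines of Python (an illustrative single-linkage toy on 1-D points, not part of HAP):

```python
# Minimal agglomerative (bottom-up) clustering sketch.
# Each point starts in its own cluster; at every step the two closest
# clusters (single linkage) are merged, until one cluster remains.
# The sequence of merges forms the hierarchy (a tree).

def single_linkage(a, b):
    # Distance between clusters = smallest pairwise point distance.
    return min(abs(x - y) for x in a for y in b)

def agglomerative(points):
    clusters = [[p] for p in points]   # every point is its own cluster
    merges = []                        # record of the hierarchy
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

history = agglomerative([0.0, 0.1, 5.0, 5.2])
print(history[0])  # ([0.0], [0.1]): the closest pair merges first
```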
Model
• Goal: We propose a hierarchical exemplar-based clustering objective function in terms of a high-order factor graph, and we derive an efficient approximate loopy max-sum algorithm.
• We wish to find a set of L consecutive layers
of clustering, where the points to be clustered
in layer l are constrained to be in the exemplar
set of layer l-1.
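The layer constraint can be illustrated with a toy greedy scheme: cluster the points, then cluster only the chosen exemplars at the next layer. Here `flat_cluster` and `greedy_hierarchy` are hypothetical names, and the flat clusterer is a stand-in (nearest-exemplar assignment with a fixed exemplar count), not affinity propagation:

```python
# Toy greedy layer-by-layer scheme: the points clustered at layer l
# are exactly the exemplars chosen at layer l-1.

def flat_cluster(points, k):
    # Hypothetical stand-in for a flat exemplar-based clusterer:
    # take the k smallest points as "exemplars" and assign every
    # point to its nearest exemplar.
    exemplars = sorted(points)[:k]
    assignment = {p: min(exemplars, key=lambda e: abs(p - e)) for p in points}
    return exemplars, assignment

def greedy_hierarchy(points, ks):
    layers = []
    current = points
    for k in ks:                      # one flat clustering per layer
        exemplars, assignment = flat_cluster(current, k)
        layers.append(assignment)
        current = exemplars           # next layer sees only the exemplars
    return layers
```

HAP instead optimizes all layers jointly; this sketch only shows the nesting constraint that both methods share.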
• (a) HAP factor graph; a single layer of the standard AP model is shown in the dotted square. (b) HAP messages.
Differences
1. The main difference compared to the flat representation is manifested in the constraint functions:
if point i is not chosen as an exemplar at layer l−1 (i.e., if the corresponding exemplar-indicator variable is 0), then point i will not be clustered at layer l. Alternatively, if point i is chosen as an exemplar at layer l−1, it must choose an exemplar at layer l.
• 2. We note that the messages passed in the first layer (layer 1) and the messages passed in the top-most layer (layer L) are identical to the standard AP messages for a single AP layer.
Experiments
2D synthetic data
Analysis of Synthetic HIV Sequences
Figure: 2D synthetic data: comparison of the objective of Eq. (8) achieved by HAP and its greedy counterpart (Greedy).
Top: median percent improvement of HAP over Greedy for a given number of layers used.
Bottom: scatter plots of the net similarity achieved by HAP vs. Greedy. Experiments for which HAP obtains better results than Greedy are below the line. The total percentage of settings where HAP outperforms Greedy is reported in the inset. Color in the scatter plot indicates the number of layers.
First, we plotted precision vs. recall for various clustering settings.
• Synthetic HIV data: precision-recall for HAP, Greedy, HKMC, and HKMeans applied to the problem of identifying ancestral sequences from a set of 867 synthetic HIV sequences. For HKMC and HKMeans, we plot only the best precision obtained for each unique recall value.
• Synthetic HIV data: distribution of Rand index for
different experiments using HAP and Greedy. A higher
Rand index indicates the solution better resembles the
ground truth. Experiments for which HAP obtains
better results than Greedy are below the line. The
percentage of solutions that identified the correct
single ancestor sequence at the top layer (layer 4) is
also reported.