Report: Heat Kernel Signature
Dafne van Kuppevelt, Vikram Doshi, Seçkin Savaşçı
Introduction
The heat kernel k_t(x, y) can be thought of as the amount of heat that is transferred from x to y in time t, given a unit heat source at x. The Heat Kernel Signature (or HKS) is obtained by restricting the well-known heat kernel to the temporal domain. The work described in this report is an implementation of the Heat Kernel Signature, including retrieval and visualization parts. More precisely, our goal was to compute the HKS of the 684 given 3D models and to find a useful way to use the HKS for retrieval.
Some of our implementation details:

• We used mainly C#, but we also have a small part written in C++.
• Development was done in Visual Studio.
• In our final version, all external libraries have non-commercial licenses.
• Our project is hosted on Google Code: http://code.google.com/p/infomr-group-x
• As proof of correctness we provide both visualizations and graphs. The visualizer of 3D models is developed on the XNA framework.
A brief word about the method
For a compact manifold M, the heat kernel has the following eigendecomposition:

k_t(x, y) = \sum_{i=1}^{\infty} e^{-\lambda_i t} \Phi_i(x) \Phi_i(y)

where \lambda_i and \Phi_i are the i-th eigenvalue and the i-th eigenfunction of the Laplace-Beltrami operator, respectively.
Rather than using the heat kernel itself as a point signature, the Heat Kernel Signature (or HKS) is obtained by restricting it to the temporal domain, i.e. HKS(x, t) = k_t(x, x). It has been proved in 'A Concise and Provably Informative Multi-Scale Signature Based on Heat Diffusion' that the HKS inherits many properties from the heat kernel, such as being stable under perturbations of the shape, and can therefore be used in applications that involve deformable shapes. More remarkably, it has also been proved that, under certain mild assumptions, the set of Heat Kernel Signatures at all points of the shape fully characterizes the shape. This means that Heat Kernel Signatures are concise and easily commensurable, since they are defined over the common temporal domain, while at the same time preserving almost all of the information contained in the heat kernel.
The signature captures information about the neighborhood of a point on a shape by recording the
dissipation of heat from the point onto the rest of the shape and back to itself over time. Because heat
diffuses to progressively larger neighborhoods, the varying time parameter helps describe the shape
around a point. This means, in particular, that the detailed, highly local shape features are observed
through the behavior of heat diffusion over short time, while the summaries of the shape in large
neighborhoods are observed through the behavior of heat diffusion over longer time.
Method
There are several ways to use the Heat Kernel Signature to compute the distance between objects. We tried and implemented four different methods, each being an improvement on the previous one. In all our methods we compute a feature vector for each model and compute the distance between models using the L2-norm on these feature vectors.
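As a small illustration, a minimal C# sketch of this distance computation (the class and method names here are ours, for illustration only):

    using System;

    static class FeatureDistance
    {
        // L2 (Euclidean) distance between two feature vectors of equal length.
        public static double L2(double[] a, double[] b)
        {
            double sum = 0.0;
            for (int i = 0; i < a.Length; i++)
            {
                double d = a[i] - b[i];
                sum += d * d;
            }
            return Math.Sqrt(sum);
        }
    }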
Method 1
Our initial crude method was to compute the sum of the HKS for all vertices:
HKSSum_t = \sum_{x=1}^{N} k_t(x, x)
We compute this for 8 different values of t, uniformly sampled on a logarithmic scale over [-4, 2]. This results in a feature vector of length 8 for each model.
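As an illustration, a minimal sketch of this computation, assuming the eigenvalues and eigenvectors of the Laplacian are available as plain arrays and interpreting [-4, 2] as the range of log10(t); both the interpretation and the names are assumptions, not our exact code:

    using System;

    static class HksMethod1
    {
        // k_t(x, x) = sum_i exp(-lambda_i * t) * Phi_i(x)^2,
        // where eigVecs[i][x] is the value of the i-th eigenvector at vertex x.
        public static double HeatKernel(double[] eigVals, double[][] eigVecs, int x, double t)
        {
            double k = 0.0;
            for (int i = 0; i < eigVals.Length; i++)
                k += Math.Exp(-eigVals[i] * t) * eigVecs[i][x] * eigVecs[i][x];
            return k;
        }

        // HKSSum_t = sum over all vertices of k_t(x, x).
        public static double HksSum(double[] eigVals, double[][] eigVecs, int vertexCount, double t)
        {
            double sum = 0.0;
            for (int x = 0; x < vertexCount; x++)
                sum += HeatKernel(eigVals, eigVecs, x, t);
            return sum;
        }

        // Method 1 feature vector: HKSSum_t for 8 values of t,
        // uniformly spaced on a logarithmic scale (log10 t in [-4, 2]).
        public static double[] Features(double[] eigVals, double[][] eigVecs, int vertexCount)
        {
            var features = new double[8];
            for (int j = 0; j < 8; j++)
            {
                double logT = -4.0 + j * 6.0 / 7.0;
                features[j] = HksSum(eigVals, eigVecs, vertexCount, Math.Pow(10.0, logT));
            }
            return features;
        }
    }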
The disadvantage of this method is that the values of t are fixed across all models and chosen relatively arbitrarily. Thus the time points may not be very representative for the model: for example, in a larger model it takes more time for the heat to spread.
Method 2
We therefore used a more advanced method to sample the time points. For each model, we sample 100 points on a logarithmic scale over the time interval [tmin, tmax], where:

t_{min} = \frac{4 \ln 10}{\lambda_{end}}, \qquad t_{max} = \frac{4 \ln 10}{\lambda_2}

Here \lambda_2 is the second eigenvalue and \lambda_{end} is the last eigenvalue.
We then again compute HKSSum_t for each time point, which gives a feature vector of length 100 for each model. We noticed that HKSSum_t remains almost unchanged for t > tmax, as it is then mainly determined by the eigenfunction \Phi_2.
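For illustration, a sketch of this time sampling, assuming the eigenvalues are sorted in ascending order (indexing and names are illustrative, not our exact code):

    using System;

    static class TimeSampling
    {
        // Sample 'count' time values logarithmically between tmin and tmax,
        // with tmin = 4 ln 10 / lambda_end and tmax = 4 ln 10 / lambda_2.
        public static double[] SampleTimes(double[] eigVals, int count)
        {
            double lambda2 = eigVals[1];                     // second eigenvalue
            double lambdaEnd = eigVals[eigVals.Length - 1];  // last (largest) eigenvalue
            double tMin = 4.0 * Math.Log(10.0) / lambdaEnd;
            double tMax = 4.0 * Math.Log(10.0) / lambda2;

            var times = new double[count];
            double logMin = Math.Log(tMin);
            double logMax = Math.Log(tMax);
            for (int j = 0; j < count; j++)
                times[j] = Math.Exp(logMin + j * (logMax - logMin) / (count - 1));
            return times;
        }
    }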
Method 3
We can approximate the HKS by using only 300 eigenvalues and eigenvectors. We then get:

k_t(x, x) = \sum_{i=1}^{300} e^{-\lambda_i t} \Phi_i(x) \Phi_i(x)

HKSSum_t = \sum_{x=1}^{N} k_t(x, x)
We again sample the time points on a logarithmic scale over the time interval [tmin, tmax], but now

t_{min} = \frac{4 \ln 10}{\lambda_{300}}
This method requires significantly less computation than the previous one, since we only need to calculate 300 eigenvalues and eigenvectors. Also, computing the HKS of one vertex takes less time, namely O(300) instead of O(N).
Our feature vector was of length 100.
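In code, this truncation only changes the upper bound of the loop over eigenpairs; a sketch under the same assumptions as the earlier one:

    using System;

    static class HksTruncated
    {
        // Truncated heat kernel: use only the first maxEig (e.g. 300) eigenpairs.
        public static double HeatKernel(double[] eigVals, double[][] eigVecs, int x, double t, int maxEig)
        {
            int n = Math.Min(maxEig, eigVals.Length);
            double k = 0.0;
            for (int i = 0; i < n; i++)
                k += Math.Exp(-eigVals[i] * t) * eigVecs[i][x] * eigVecs[i][x];
            return k;
        }
    }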
Method 4
In the previous methods, we sum over all vertices to get the signature of the complete model. This means that points on one model are not compared to corresponding points on another model. To accomplish this, in the fourth method we find 200 distinct feature points on each model. These feature points are the vertices with the highest local maxima of the scaled HKS, computed over multiple time scales:
d_{[t_{min}, t_{max}]}(x, x') = \sum_{t=t_{min}}^{t_{max}} \left( \frac{k_t(x, x)}{\sum_{i=1}^{N} k_t(x_i, x_i)} - \frac{k_t(x', x')}{\sum_{i=1}^{N} k_t(x'_i, x'_i)} \right)
This results in a feature vector of length 200 * 100 (200 vertices and 100 different t values for each
vertex) for each model.
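The selection procedure itself is sketched below; the local-maximum test over mesh neighbours and the ranking by the value at the largest time are illustrative assumptions, not necessarily the exact criterion we used:

    using System.Collections.Generic;
    using System.Linq;

    static class FeaturePointSelection
    {
        // Select up to 'count' feature vertices: vertices whose scaled HKS at the
        // largest sampled time is a local maximum over their mesh neighbours,
        // ranked by that value. scaledHks[x][j] holds k_t(x,x) / sum_i k_t(x_i,x_i)
        // for vertex x and the j-th sampled time; neighbours[x] lists adjacent vertices.
        public static int[] Select(double[][] scaledHks, List<int>[] neighbours, int count)
        {
            int lastT = scaledHks[0].Length - 1;
            var candidates = new List<int>();
            for (int x = 0; x < scaledHks.Length; x++)
            {
                bool isLocalMax = neighbours[x].All(n => scaledHks[x][lastT] >= scaledHks[n][lastT]);
                if (isLocalMax) candidates.Add(x);
            }
            return candidates
                .OrderByDescending(x => scaledHks[x][lastT])
                .Take(count)
                .ToArray();
        }
    }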
Visualization of the highest-n HKS feature point selection, from left to right: n = 15, 100, 200 and 500
These persistent feature points shown above are high-curvature points that are local maxima of the HKS. As the pictures show, using all vertices does not give an efficient feature vector, while taking only a few vertices loses a lot of important information. We therefore chose to use 200 vertices.
Implementation
As soon as we were confident and had reached consensus about the ideas in our reference paper, we started the implementation. Rather than implementing the whole pipeline at once, we divided our implementation into 4 stages: in the Initial Parser we compute the Laplacian matrices from the 3D models and the eigenvalues and eigenvectors of those Laplacian matrices; using these eigenvalues and eigenvectors, we calculate the HKS feature vectors in the HKS Calculation stage; in Retrieval we focus on matching, distance calculations and the resulting performance measures; and in the Visualization stage we visualize the results of our Heat Kernel Signature calculations.
Initial Parser
The Initial Parser contains the operations that take much time but, luckily, only need to be done once. These include the computation of the Laplacian matrices and the corresponding eigenvalues and eigenvectors. The main challenge in this stage is to find efficient techniques to perform these calculations and to make sure that the results of this stage can easily be used in later stages.
a. Computing Laplacian Matrices
We tried to write the code for the computations ourselves as much as possible, without using external libraries. Our goal was to implement a method that parses the 3D models and outputs the Laplacian matrices to text files, one by one. Our first successful implementation used an integer for each element of the resulting Laplacian matrix and wrote the output in a human-readable format. However, tests showed that it was slow and consumed too much space. Our test cases showed that parsing every model would take more than 2 hours. Also, for the largest models (in terms of vertex count), it would go beyond the x86 address space. To be more precise, our largest model has 45000 vertices, which results in a matrix data structure of size 45000 * 45000 * 4 B = 7.54 GB; this is greater than our accessible address space in Windows (2 GB, an OS restriction) and also greater than the 2^32 bytes addressable on x86.
We came up with a solution: reducing the matrix and using a single bit to represent each element. The Laplacian matrix is a sparse symmetric matrix, and the diagonal values can be computed by traversing the corresponding row or column. We therefore decided to store only the lower triangular part, and we also omitted the diagonal because it can be computed by a row traversal. The resulting data is a lower triangular matrix without the diagonal. By definition of the Laplacian matrix, elements in this part can only take the value -1 or 0. We replaced the -1s with 1s, so each element can be stored in a single bit. If the program requests an element of the upper triangle, we return the corresponding lower-triangle element; if a diagonal element is needed, we compute it on the fly. Using this scheme, we were able to parse all files in 2 minutes, including the largest ones that we could not parse with the previous method. To be precise, we then need only 45000 * 44999 / 2 bits ≈ 120 MB of space for the largest file. As a result we get roughly a 64x compression.
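A simplified sketch of this bitwise storage scheme (illustrative only; our actual class differs in its details): the strict lower triangle is stored one bit per entry, and diagonal (degree) entries are computed on the fly.

    using System;
    using System.Collections;

    // Strict lower triangle of the symmetric Laplacian, one bit per entry:
    // bit set -> edge present (off-diagonal value -1), bit clear -> no edge (0).
    [Serializable]
    class PackedLaplacian
    {
        private readonly BitArray bits;
        private readonly int n;

        public PackedLaplacian(int vertexCount)
        {
            n = vertexCount;
            bits = new BitArray(n * (n - 1) / 2);
        }

        // Position of entry (row, col), row > col, in the packed lower triangle.
        private int Index(int row, int col)
        {
            return row * (row - 1) / 2 + col;
        }

        public void SetEdge(int i, int j)
        {
            if (i == j) return;
            int row = Math.Max(i, j), col = Math.Min(i, j);
            bits[Index(row, col)] = true;
        }

        // Full Laplacian entry: the degree on the diagonal (computed on the fly),
        // -1 for adjacent vertices, 0 otherwise. Upper-triangle requests are
        // answered from the stored lower triangle by symmetry.
        public int this[int i, int j]
        {
            get
            {
                if (i == j)
                {
                    int degree = 0;
                    for (int k = 0; k < n; k++)
                        if (k != i && this[i, k] == -1) degree++;
                    return degree;
                }
                int row = Math.Max(i, j), col = Math.Min(i, j);
                return bits[Index(row, col)] ? -1 : 0;
            }
        }
    }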
Figure 1: Simple Model; Table 1: Our First Implementation; Table 2: Compressed Implementation
The next decision we had to make was what storage format to use for these Laplacian matrices. Our first implementation produced a human-readable text file, in which we converted the values to strings and added white space and line feeds for readability. However, such formatting increases the parsing time, and we had to implement a loader method that reads these matrices back into memory when we need them. We eventually decided to serialize our matrix data to files. With this decision, the resulting files are no longer human-readable, but we can simply deserialize them in our program for further calculations.
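For example, with the .NET BinaryFormatter the save/load round trip could look roughly like this (reusing the PackedLaplacian sketch from above; not our exact code):

    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;

    static class MatrixStore
    {
        // Serialize a [Serializable] matrix object to a binary file.
        public static void Save(string path, PackedLaplacian matrix)
        {
            var formatter = new BinaryFormatter();
            using (var stream = File.Create(path))
                formatter.Serialize(stream, matrix);
        }

        // Deserialize it again for the later stages of the pipeline.
        public static PackedLaplacian Load(string path)
        {
            var formatter = new BinaryFormatter();
            using (var stream = File.OpenRead(path))
                return (PackedLaplacian)formatter.Deserialize(stream);
        }
    }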
b. Computing Eigenvalues & Eigenvectors
Computing eigenvalues and eigenvectors is a challenging problem because of its high complexity. As stated before, we first tried to implement our own eigendecomposition, but after considerable effort we did not manage to produce a working version, so we decided to use a third-party library. In the literature there are specialized techniques, such as the Lanczos algorithm, for different kinds of matrices, so we had to find a library that fits our input Laplacian matrices, which are sparse and symmetric. We could not find a library with a specialized eigendecomposition for this type of matrix, so we tried every library we could find on the web and evaluated them.
Library      | Spec                           | License | Time     | Memory
Math.NET     | C# x86 sequential LAPACK       | xGPL    | 20+ mins | Average
DotNumerics  | C# x86 sequential LAPACK port  | xGPL    | 2 mins   | Average
CenterSpace  | C# x86 sequential LAPACK       | $1295   | 30+ mins | Good
Eigen        | C++ x86 sequential UNIQUE      | xGPL    | 2 hours  | Good

Table 3: Evaluation results for a 2000-vertex model
After evaluating these libraries, we decided to use the DotNumerics library. With DotNumerics we had a working eigendecomposition, but it had some drawbacks for our project:

• We had to ignore the largest ~40 matrices: DotNumerics uses double precision for storing matrix elements, and although it has a symmetric matrix storage class, the largest matrices still need more memory than the x86 architecture can provide.
• We needed to create a library copy of our Laplacian matrices during parsing, because DotNumerics uses LAPACK routines under the hood, which means we cannot use our bitwise matrix format as input for its eigendecomposition methods.

We also continued to use serialization for the outputs of the library: the computed eigenvalues and eigenvectors are serialized to corresponding files.
c. An Experimental Approach for Future Implementations
Although we used DotNumerics throughout our project and ignored the largest matrices, we came up with a prototype for future work. The prototype is written in C++ rather than C# and uses the Intel Math Kernel Library©, Boost, and a tweaked version of the Armadillo library. This prototype parser runs on the x64 architecture and takes only 10 seconds to compute the eigenvalues and eigenvectors of a 2000-vertex model (recall that DotNumerics does the same job in 2 minutes). We had two main reasons for developing this prototype. First, today's CPUs are mostly x64-capable, and we wanted to see how much the same implementation could improve when run on x64. Secondly, an x86 Windows OS limits the available memory to 2 GB per process, and we need more than that to parse the largest matrices. However, the prototype's performance does not hold up for the largest matrices: we tested it on a machine with 3 GB of main memory, with the rest of the data stored in virtual memory, and for the largest matrices the disk I/O, and with it the computation time, increased dramatically.
HKS Calculation
In the final version of our program, we calculate the HKS feature vectors in three different ways, corresponding to the second, third and fourth method described earlier. The implementation of these calculations is straightforward: for each model, the program reads the eigenvalue and eigenvector files and uses these to compute the feature vectors.
Retrieval
In the Retrieval program, the null-transformation of each model is found and used as a query. For
increasing scope size, the program retrieves the closest items (in terms of L2-distance between the
feature vectors) and computes the recall, specificity and precision of the retrieval. This results in values
for an ROC-curve and a Recall/Precision graph for each model.
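A sketch of how precision and recall at a given scope size can be computed for one query (standard definitions; the data layout and names are ours for illustration):

    using System;
    using System.Linq;

    static class RetrievalEvaluation
    {
        // Rank all other models by L2 distance to the query's feature vector and
        // compute precision and recall at the given scope size.
        // classLabels[m] is the class of model m; features[m] is its feature vector.
        public static void Evaluate(double[][] features, int[] classLabels,
                                    int queryIndex, int scope,
                                    out double precision, out double recall)
        {
            int queryClass = classLabels[queryIndex];
            int relevantTotal = classLabels.Count(c => c == queryClass) - 1; // excluding the query

            var retrieved = Enumerable.Range(0, features.Length)
                .Where(m => m != queryIndex)
                .OrderBy(m => L2(features[queryIndex], features[m]))
                .Take(scope)
                .ToArray();

            int relevantRetrieved = retrieved.Count(m => classLabels[m] == queryClass);
            precision = (double)relevantRetrieved / scope;
            recall = relevantTotal > 0 ? (double)relevantRetrieved / relevantTotal : 0.0;
        }

        private static double L2(double[] a, double[] b)
        {
            double sum = 0.0;
            for (int i = 0; i < a.Length; i++) { double d = a[i] - b[i]; sum += d * d; }
            return Math.Sqrt(sum);
        }
    }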
Visualization & Proof of Correctness
To show that our computations are correct, we visualized models with their HKS.
For the 3D model visualizer, we used the XNA framework. Our implementation reads a 3D model and the corresponding HKS value file and draws a 3D scene. One can navigate this scene using an Xbox controller. Also, if the data is provided, one can view the results for the model at different time values.
Intuitively, k_t(x, x) can be interpreted as the probability that Brownian motion starting at x returns to the same point x after time t, i.e. a weighted average over all possible paths between x and x in time t.
Below we show how the heat transfer in the model evolves for different time steps t. When t is infinitesimally small (tending to 0, but not 0), the object is completely red: we apply a unit heat source at each vertex, and only a very small amount of heat has been transferred away. Moreover, the parts with higher curvature always show more redness, i.e. retain more heat. As time passes, the object gets colder and colder, turning from red to green and eventually blue.
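As an illustration of the coloring, a possible red-green-blue mapping of a normalized HKS value (the exact color ramp in our visualizer may differ):

    using System;

    static class HeatColorMap
    {
        // Maps an HKS value, normalized to [0, 1] over the whole model, to RGB:
        // 1 -> red (hot), 0.5 -> green, 0 -> blue (cold).
        public static void ToRgb(double normalizedHks, out float r, out float g, out float b)
        {
            double v = Math.Max(0.0, Math.Min(1.0, normalizedHks));
            if (v >= 0.5)   // green -> red
            {
                r = (float)((v - 0.5) * 2.0);
                g = (float)(1.0 - (v - 0.5) * 2.0);
                b = 0f;
            }
            else            // blue -> green
            {
                r = 0f;
                g = (float)(v * 2.0);
                b = (float)(1.0 - v * 2.0);
            }
        }
    }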
Color changes under different t values: from left to right t values increase
Performance
All three methods using the Heat Kernel Signature seem to work well for some types of transformations and not at all for others. This is clearly visible in the ROC curves: for most models the curve is flat, with a very high specificity at the beginning, and then suddenly drops. This means that the first models retrieved are all of the same class, while after the drop the remaining objects of the same class are hard to distinguish from objects of other classes.
It is also clear from the ROC curves that each method improves on the previous one. Whereas the first method does not work well for all models (for example, the Flamenco curve is very low), the second method has high curves for all models, and the ROC curves for the third method are even higher.
Method 1: Sum of all vertices
Figure 2: ROC curve for Sum of all vertices
Figure 3: Recall/Precision graph of Sum of all vertices
Method 2: Sum for 300 vertices
Figure 4: ROC curve for Sum for 300 vertices
Figure 5: Recall/Precision graph for Sum for 300 vertices
Method 3: Scaled vertices
Figure 6: ROC curve for Scaled Vertices
Figure 7: Recall/Precision graph for Scaled Vertices
Conclusion
We have described the implementation of three different variations of the heat kernel signature. The ROC and precision/recall curves show that using one of these methods is a reliable way to match 3D objects.
The disadvantage of these methods is that computing the eigendecomposition (using the DotNumerics library) is a very costly operation. However, we expect the prototype we created using the Intel Math Kernel Library©, Boost and a tweaked version of the Armadillo library to give faster and more efficient results. The efficiency could be improved even further by not using an eigendecomposition but a different approach to calculate the heat kernel.
For the heat kernel to exist, the model should be a compact Riemannian manifold, possibly with a boundary; hence a signature method that works for models with holes, partial models and single views still needs to be developed.
In addition, an implementation based on the Laplace-Beltrami operator instead of the graph Laplacian matrix would give better results, since the entire geometry of the model would be taken into consideration, not just whether two vertices are adjacent (as in our implementation). Moreover, we also see the need for more analysis to find the optimal number of scaled vertices and their corresponding time steps.